Basic data of science

4/6/2023

Data exploration is the first step, and it is the most important one because it consumes the most time: around 70 per cent of the time is spent on data exploration. The main ingredient for data science is data, and data seldom arrives in a correct, structured form. It usually contains a lot of noise, meaning unwanted data that is not required. This step involves sampling and transforming the data: we check the observations (rows) and features (columns) and remove the noise using statistical methods. The step is also used to check the relationships among the features (columns) in the data set, that is, whether the features depend on each other or are independent, and whether there are missing values. In short, the data is transformed and readied for further use, which is why this is one of the most time-consuming steps.

By now, our data is prepared and ready to go. Modelling is the second step, where we actually use machine learning algorithms. The selection of a model depends on the type of data we have and on the business requirement. For example, the model for recommending an article to a customer will be different from the model for predicting the number of articles that will be sold on a particular day. Once the model is decided, we fit the data into the model.

Testing is the next step, and it is very important for the performance of the model. The model is tested with test data to check its accuracy and other characteristics, and the required changes are made to the model to get the desired result. If we do not get the desired accuracy, we can go back to step 2 (modelling), select a different model, repeat step 3, and choose the model that gives the best result for the business requirement.

Once proper testing gives the desired result for the business requirements, we finalize the model that performed best in testing and deploy it in the production environment.

Understanding the business is the most important characteristic of a data scientist. Unless you understand the business, you cannot make a good model, even if you have good knowledge of machine learning algorithms or statistical skills. A data scientist needs to understand the business requirements and develop analytics according to them, so domain knowledge of the business is also important and helpful.

Intuition

Although the math involved is proven and foundational, a data scientist needs to pick the right model with the right accuracy, because not all models give the same results. A data scientist therefore needs a feel for when a model is ready for production deployment. They also need the intuition to know at what point the production model has become stale and needs refactoring to respond to a changing business environment.

Curiosity

Data science is not a new field. It has existed for some time, but progress in the field is now very fast. New methods to solve familiar problems are being developed constantly, so curiosity to learn emerging technologies is very important for a data scientist.

The range of data science applications is huge. Here are examples of a few sectors where data science can be used or is already being used actively.

There is huge scope in marketing, for example in improved pricing strategy. Companies like Uber and e-commerce companies can use data science-driven pricing to increase their profits.

In healthcare, wearable data can be used to prevent and monitor health problems. The data generated from the body can be used to prevent future emergencies.

Banking and Finance

In the banking sector, data science can be applied to fraud detection, which can help reduce the non-performing assets of banks.

The Government can use data science to prepare better policies that cater to people's needs and wants, using data collected through surveys and from other official sources.

Advantages and Disadvantages of Data Science

Below are the advantages and disadvantages.

Advantages: Data science helps us get insights from historical data with its powerful tools. It helps to optimize the business, hire the right people and generate more revenue, because it supports better decisions about the future of the business. Companies can develop and market their products better because they can select their target customers more precisely. Data science also helps consumers find better goods, especially on e-commerce sites that use data-driven recommendation systems.

Disadvantages: The disadvantages generally arise when data science is used for customer profiling and infringes on customer privacy.
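As a rough illustration of the workflow described in this article (data exploration, modelling, testing, and picking the model to deploy), here is a minimal sketch in plain Python. The toy data set, the two candidate models, and all function names are assumptions made up for this example; a real project would normally rely on dedicated libraries and far more data.

```python
from statistics import mean

# Hypothetical toy data: (feature, target) pairs; None marks a missing value.
raw_rows = [
    (1.0, 12.0), (2.0, 14.0), (3.0, None), (4.0, 18.0),
    (5.0, 20.0), (6.0, 22.0), (None, 30.0), (7.0, 24.0),
    (8.0, 26.0), (9.0, 28.0),
]

def clean(rows):
    """Step 1 (exploration): drop observations that have missing values."""
    return [r for r in rows if None not in r]

def pearson(xs, ys):
    """Step 1 (exploration): measure how strongly two columns are related."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def fit_mean_model(xs, ys):
    """Candidate model A: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m

def fit_linear_model(xs, ys):
    """Candidate model B: a one-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mean_abs_error(model, xs, ys):
    """Step 3 (testing): average absolute error on held-out test data."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))

# Step 1: explore and clean the data.
rows = clean(raw_rows)
xs = [r[0] for r in rows]
ys = [r[1] for r in rows]

# Hold out the last quarter of the rows as test data.
split = int(len(rows) * 0.75)
train_x, test_x = xs[:split], xs[split:]
train_y, test_y = ys[:split], ys[split:]

# Steps 2 and 3: fit each candidate on the training data, score it on the test data.
candidates = {"mean": fit_mean_model, "linear": fit_linear_model}
scores = {
    name: mean_abs_error(fit(train_x, train_y), test_x, test_y)
    for name, fit in candidates.items()
}

# Step 4: the candidate with the lowest test error is the one to deploy.
best_name = min(scores, key=scores.get)

print("rows dropped:", len(raw_rows) - len(rows))
print("feature/target correlation:", round(pearson(xs, ys), 3))
print("test errors:", scores, "best:", best_name)
```

The loop over `candidates` mirrors the "go back to step 2, select a different model, repeat step 3" idea: adding another entry to the dictionary is enough to compare one more model on the same test data.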
