Storing and retrieving data go hand in hand in business success. Both the number of businesses and their use of data are growing at a phenomenal rate. A simple list of customers and suppliers is no longer enough for effective marketing decision-making. Modern marketing practice requires marketers to be equipped with sophisticated data.
The growing presence of web-based businesses and the expansion of global conglomerates mean that micro-, small- and medium-sized enterprises must keep abreast of modern data mining trends in marketing if they are to hold a market share. This article takes a critical look at data mining issues likely to affect the competitive growth of micro-, small- and medium-sized enterprises.
Before I delve into the essentials of data mining issues, I want to trace the development of data mining. Data mining has evolved from many years of assiduous work in disciplines such as databases, algorithms, information retrieval, statistics and machine learning.
Much of the terminology of data mining comes from artificial intelligence (AI), databases (DB), statistics (Stats) and information retrieval (IR). These historical antecedents have shaped several views of what data mining actually is:
- Induction is used to proceed from specific knowledge to general information. This type of technique is often found in AI applications.
- The primary objective of data mining is to describe some characteristics of a set of data by a general model, so this method can be viewed as a type of compression. The detailed data within the database are abstracted and compressed into a smaller description of the data characteristics found in the model.
- The data mining process can also be viewed as a type of querying of the underlying database. Ongoing research in data mining seeks ways to define and accurately capture data mining queries for marketing decision-making.
- Describing a large database with a model can be viewed as using approximation to help reveal hidden information about the data.
- With large databases, size and efficiency matter, and developing an abstract model can be thought of as a type of search problem.
Data mining issues
There are many issues associated with data mining. I will tackle several of the most important ones.
- Immaterial data
Relevant data is important for marketing decision-making and business growth. Even though data is the key ingredient for driving market growth, not all data is relevant to the marketer handling a particular marketing operation. Immaterial data refers to those portions of the database that are not required for a particular data mining task.
For example, a business enterprise holds in its database data on its suppliers, government agencies, customers (both local and international), potential buyers, etc. When the enterprise wants to reach its international customers with new product information, the data on its potential buyers, local customers, etc. is immaterial to that data mining process.
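As a minimal sketch of the example above, the immaterial records can be set aside before any mining starts. The `Contact` record, its `category` and `region` fields, and the sample names are all hypothetical; a real database will have its own layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """Hypothetical record layout, assumed for illustration only."""
    name: str
    category: str  # e.g. "customer", "supplier", "agency", "prospect"
    region: str    # "local" or "international"


def relevant_contacts(contacts, category, region):
    """Keep only the records that matter for this marketing operation."""
    return [c for c in contacts if c.category == category and c.region == region]


contacts = [
    Contact("Adjei Traders", "customer", "local"),
    Contact("Nordwind GmbH", "customer", "international"),
    Contact("Revenue Office", "agency", "local"),
    Contact("Kobe Imports", "customer", "international"),
    Contact("Lagos Fabrics", "supplier", "international"),
    Contact("Future Buyer Ltd", "prospect", "international"),
]

# Only international customers are material to the new-product campaign.
international_customers = relevant_contacts(contacts, "customer", "international")
print([c.name for c in international_customers])
```

Everything filtered out here remains in the database; it is simply left out of this particular mining task.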
- High amount of data
A modern database schema comprises many different attributes. The challenge is that not all of these attributes will be required for a particular data mining purpose. Using some of them may even interfere with the correct completion of the data mining task being undertaken.
Using additional attributes may increase the overall complexity and reduce the efficiency of an algorithm. This problem is often referred to as the curse of dimensionality: there are many dimensions involved, and it is difficult to determine which ones to use. One solution is to reduce the number of dimensions, which is known as dimensionality reduction. Determining which dimensions are not needed, however, is not always easy to do.
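One very simple form of dimensionality reduction is to drop attributes whose values barely change from record to record. The sketch below uses a variance threshold for this; the technique is my choice for illustration, and the attribute names and figures are made up.

```python
from statistics import pvariance


def drop_low_variance(rows, threshold=0.0):
    """Drop numeric attributes whose variance does not exceed `threshold`.

    `rows` is a list of dicts sharing the same numeric keys.
    Returns the reduced rows and the names of the attributes kept.
    """
    names = list(rows[0].keys())
    kept = [n for n in names if pvariance([r[n] for r in rows]) > threshold]
    reduced = [{n: r[n] for n in kept} for r in rows]
    return reduced, kept


# Hypothetical customer records; "branch_code" is the same for every row.
rows = [
    {"spend": 120, "visits": 4, "branch_code": 1},
    {"spend": 300, "visits": 9, "branch_code": 1},
    {"spend": 80, "visits": 2, "branch_code": 1},
    {"spend": 210, "visits": 6, "branch_code": 1},
]

reduced, kept = drop_low_variance(rows)
print(kept)
```

A constant attribute carries no information for telling records apart, so it is safe to drop. Attributes with low but non-zero variance are a harder call, which is why deciding what to discard usually needs domain knowledge as well.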
Even when an organisation holds large amounts of data, determining the intended use of the mined data can be a real challenge. Data has multiple uses, so the intended use is often more of an issue than the algorithms themselves. Where the organisation has not previously used the data for the same purpose, it becomes difficult to determine how to use the data and for what exact purpose. Business practices may sometimes have to change so that the mined data can be used successfully.
- Human interaction
Data mining problems are often not precisely stated. There is therefore a need for an interface between domain experts and technical experts. Technical experts, by virtue of their knowledge and training in the field, formulate the queries that the data mining requires and assist in interpreting the results.
Organisations must recognise that knowledge discovery in databases is a specialist field. To get the most out of their databases, they will have to engage the services of database experts, who will in turn train the users to achieve the desired results. It is thus necessary for organisations to invest in expertise if data mining is to succeed.
- Interpretation of results
Here again, expert knowledge is required to interpret the data mining results; without it, the results might be meaningless to the ordinary database user. Since these results are critical to organisational success, an inaccurate interpretation can have dire consequences.
- Visualisation of results
Visualisation of results simply means that the output of the data mining must be presented in an understandable form that can easily be viewed and used for decision-making. Data in its raw state has no meaning until it is processed and meaning is inferred from it for targeted organisational decisions.
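To illustrate the step from raw data to something a decision-maker can read at a glance, here is a small sketch that counts raw order records by region and draws a plain-text bar chart. The regions and the `text_bar_chart` helper are hypothetical; in practice a charting tool would normally be used.

```python
from collections import Counter


def text_bar_chart(counts, width=30):
    """Render a mapping of label -> count as a text bar chart, largest first."""
    top = max(counts.values())
    lines = []
    for label, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        # Scale each bar relative to the largest count.
        bar = "#" * max(1, round(n / top * width))
        lines.append(f"{label:<8} {bar} {n}")
    return "\n".join(lines)


# Raw, unprocessed records: the region of each order (made-up data).
raw_orders = ["Europe", "Africa", "Europe", "Asia", "Europe", "Africa"]

chart = text_bar_chart(Counter(raw_orders))
print(chart)
```

The raw list says little on its own; the chart shows immediately which region dominates.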
- Huge datasets
Depending on the scale of an organisation’s operations, the databases involved in data mining are usually huge. Their sheer size creates a lot of problems when applying algorithms designed for small datasets. The cost of many modelling applications grows steeply with dataset size, making them too inefficient for larger datasets. Sampling and parallelisation are effective tools for attacking this scalability problem.
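As one example of sampling, the sketch below draws a fixed-size random sample from a dataset too large to hold in memory, reading it once from start to finish. It uses reservoir sampling (often called Algorithm R), which is my choice for illustration rather than a prescribed method; the record stream is simulated with a range of numbers.

```python
import random


def reservoir_sample(records, k, seed=None):
    """Return k records chosen uniformly at random from an iterable.

    The iterable is read once and only k records are kept in memory.
    If it holds fewer than k records, all of them are returned.
    """
    rng = random.Random(seed)
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive at both ends
            if j < k:
                sample[j] = record
    return sample


# Simulated stream of one million customer IDs.
sample = reservoir_sample(range(1_000_000), k=100, seed=42)
print(len(sample))
```

A model can then be developed on the sample and checked against the full dataset, which is usually far cheaper than modelling every record.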
- Integration
Knowledge discovery in databases (KDD) plays a key role in database integration. The terms KDD and data mining are often used interchangeably. In the past, KDD has been used to refer to a process consisting of many steps, with data mining making up one of those steps.
KDD requests are often treated as special, unusual or one-time needs. Handled this way, they are ineffective, inefficient and not general enough to be used on an ongoing basis. Experts agree that integrating data mining functions into traditional database management systems is a desirable goal.
- Outliers
Many database entries do not fit neatly into the derived model. This becomes even more of a problem with very large databases. If a model is developed that includes such outliers, it may not behave well for the data that are not outliers.
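One common way to spot such entries before building a model is the interquartile range (IQR) rule, sketched below. The rule and the 1.5 multiplier are a widely used convention that I have assumed here, not something the database dictates, and the order values are invented.

```python
from statistics import quantiles


def split_outliers(values, k=1.5):
    """Separate values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    inliers = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return inliers, outliers


# Hypothetical order values; one entry is far from the rest.
order_values = [120, 135, 128, 142, 131, 125, 138, 5000]

inliers, outliers = split_outliers(order_values)
print(outliers)
```

Whether a flagged entry is an error or a genuinely unusual customer is a judgement for the domain expert; the rule only points out where to look.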
- Noisy data
Some attribute values in the database may be invalid or incorrect. These values should be corrected before running data mining applications.
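A minimal sketch of such a correction is shown below for a customer age attribute. The valid range of 0 to 120 and the decision to replace bad values with the median of the valid ones are both assumptions made for illustration; another organisation might prefer to flag or discard such records instead.

```python
from statistics import median


def clean_ages(ages, low=0, high=120):
    """Replace missing or out-of-range ages with the median of the valid ones."""

    def is_valid(age):
        return age is not None and low <= age <= high

    valid = [a for a in ages if is_valid(a)]
    fill = median(valid)  # assumed correction policy, see note above
    return [a if is_valid(a) else fill for a in ages]


# Made-up attribute values containing a negative age, an impossible age
# and a missing entry.
ages = [34, 29, -1, 41, 250, None, 38]

cleaned = clean_ages(ages)
print(cleaned)
```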
Data mining issues can frustrate organisational success. Organisations that want to use a database as a tool for operational success should engage the services of database experts and ensure that their database management staff are well trained. As the business world becomes increasingly global, an organisation that does not apply database management to its operations may be falling short of global trends, and its market share will lag behind what it could otherwise have gained.
Investing in data mining is a must for every ambitious organisation that wants to secure business success.