In this article, we outline the structure of an AI system in terms that are as simple as possible. We skip the technical details, since the aim is to convey only basic knowledge. Every day we receive questions such as “How do I build my own artificial intelligence system?” or “What is required to build an AI system?”
Compared to traditional computer programming, in which the software does not improve automatically, an AI system is built quite differently.
The following picture shows the idea behind a good AI engine:
It should be noted that, over time, building AI systems has become not only much less complex but also much less expensive. One illustrative example is Amazon Machine Learning, which can be used to automatically classify products in a catalog using the product description data as a training set.
For example, imagine that you spent 20 hours of computing time generating your models and received 89,000,000 real-time predictions over a month. Your cost would have been $100.
To understand this example, one should focus on machine learning, as this area covers most applications. An important note: to get into AI successfully, you need a good understanding of statistics.
Steps to develop an AI system:
- Identify the problem.
- Preparation of the data.
- Choice of algorithms.
- Training the algorithms.
- Choosing the most suitable programming language.
- Platform selection.
1. Identify the problem
First of all, the following important questions should be answered: (1) What are you trying to solve? (2) What result do you want?
Always remember that AI is not a panacea in itself. It is a tool, not the entire solution. There are several methods and many different problems that can be solved using AI.
The following analogy may help: to prepare a delicious dish, you have to know exactly which dish you are making and which ingredients it requires.
2. Preparation of the data
The first thing to look at is the data, which falls into two groups: structured and unstructured.
The term structured data generally refers to data organized in a fixed, predefined format that ensures consistent processing and easy analysis. A simple example is a customer record with first and last name, date of birth, address and other fields.
Unstructured data, on the other hand, has no formalized structure. It can include audio, pictures, symbols, words and infographics. Simple examples include emails, phone calls, and WhatsApp or WeChat messages.
The breakthrough, and one of the greatest benefits of AI, has been to enable computers to analyze unstructured data and thereby access a much larger universe of data than structured data alone.
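To make the distinction concrete, here is a minimal Python sketch. The `Customer` record, the sample message and the date format are all made up for illustration; real unstructured data is usually far messier than a single regular expression can handle.

```python
from dataclasses import dataclass
from datetime import date
import re


@dataclass
class Customer:
    """Structured data: every record has the same named, typed fields."""
    first_name: str
    last_name: str
    date_of_birth: date
    address: str


# Structured: a field can be read directly, with no interpretation needed.
customer = Customer("Ada", "Example", date(1990, 5, 17), "1 Sample Street")
birth_year = customer.date_of_birth.year

# Unstructured: the same information is buried in free text (a hypothetical
# chat message) and has to be extracted before it can be analyzed.
message = "Hi, this is Ada Example, born 17.05.1990, I moved to 1 Sample Street."
match = re.search(r"born (\d{2})\.(\d{2})\.(\d{4})", message)
extracted_year = int(match.group(3)) if match else None

print(birth_year, extracted_year)
```

In the structured case the year is one attribute access away; in the unstructured case some form of parsing or learning has to happen first.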
It is a mistake to think that the most important components of AI are complex algorithms. In fact, the most important part of the AI toolkit is data cleansing. Typically, data scientists spend around 80% of their time cleaning, moving, reviewing and organizing the data before using it or writing a single algorithm.
Businesses and large corporations have massive proprietary databases whose data may not be AI-ready, and it is very common for data to be stored in silos. As a result, the same information may exist in several copies, some of which match and some of which contradict each other. Ultimately, these data silos can prevent companies from gaining quick insights into their own internal data.
Before running the models, it is necessary to ensure that the data has been properly organized and cleaned up. This means checking consistency, establishing a chronological order, labeling the data as needed, and so on.
In general, the more carefully the data is prepared, the more likely it is that the result will solve the specified problem.
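The cleaning steps described above (checking consistency, removing duplicates that come from different silos, and putting records in chronological order) can be sketched in a few lines of Python. The records and field names below are hypothetical, and a real project would more likely use a data-frame library; this is only meant to show the kind of work involved.

```python
from datetime import date

# Hypothetical customer records merged from two data silos.
raw_records = [
    {"id": 2, "name": " Bob Lee ", "signup": "2021-03-04", "country": "de"},
    {"id": 1, "name": "Ann Kim", "signup": "2020-11-20", "country": "DE"},
    {"id": 2, "name": "Bob Lee", "signup": "2021-03-04", "country": "DE"},  # duplicate
    {"id": 3, "name": "Cy Roe", "signup": "not known", "country": "FR"},    # bad date
]


def clean(records):
    """Normalize fields, set aside invalid rows, drop duplicates, sort by date."""
    seen_ids = set()
    cleaned = []
    rejected = []
    for rec in records:
        try:
            signup = date.fromisoformat(rec["signup"])  # consistency check
        except ValueError:
            rejected.append(rec)  # keep for manual review instead of guessing
            continue
        if rec["id"] in seen_ids:  # same customer already seen in another silo
            continue
        seen_ids.add(rec["id"])
        cleaned.append({
            "id": rec["id"],
            "name": rec["name"].strip(),
            "signup": signup,
            "country": rec["country"].upper(),
        })
    cleaned.sort(key=lambda r: r["signup"])  # chronological order
    return cleaned, rejected


cleaned, rejected = clean(raw_records)
for row in cleaned:
    print(row)
```

Note that the invalid row is set aside rather than silently repaired; deciding what to do with such rows is part of the data preparation work.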
3. Choice of algorithms
As already mentioned, this article does not go into technical details. Nevertheless, a brief overview of the common types of algorithms is necessary; which ones apply depends on the type of learning selected.
1. Supervised learning
Basically, classification predicts a label and regression predicts a quantity.
An illustrative example of a classification problem is a scenario in which you want to determine whether or not a loan will default.
An illustrative example of a regression problem is a scenario in which you want to quantify the expected loss on these defaulted loans. Here a value is sought: how many euros are likely to be lost if a loan defaults?
Once the problem has been identified, the next step can be taken: choosing the algorithm.
These scenarios are simplified and not very realistic in practice. In supervised learning you can choose among algorithms such as logistic regression, random forests, support vector machines and naive Bayes classifiers.
Nevertheless, such examples are useful for understanding the types of algorithms used in AI.
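The two loan scenarios can be sketched in plain Python. This is a toy illustration under made-up assumptions: the data, the single input feature in each case, and the training settings are invented, and a real project would normally rely on an established machine learning library rather than hand-written fitting code.

```python
import math

# --- Classification: will the loan default? (predicts a label) ---
# Hypothetical data: debt-to-income ratio -> 1 if the loan defaulted, else 0.
ratios = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
defaulted = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(default) = sigmoid(w * x + b) by plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


def predict_default(w, b, x):
    """Return the predicted label: True means 'will default'."""
    return sigmoid(w * x + b) >= 0.5


# --- Regression: how much is lost on a default? (predicts a quantity) ---
# Hypothetical data: loan amount -> loss, both in thousands of euros.
amounts = [10.0, 20.0, 30.0, 40.0]
losses = [4.0, 8.0, 12.0, 16.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


w, b = train_logistic(ratios, defaulted)
slope, intercept = fit_line(amounts, losses)

print("default at ratio 0.85?", predict_default(w, b, 0.85))
print("expected loss on a 25k loan:", slope * 25.0 + intercept)
```

The classifier answers a yes/no question, while the regression returns an amount; that difference in output is what separates the two problem types.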
2. Unsupervised learning & reinforcement learning
Here the types of algorithms are quite different and can be divided into categories such as clustering, in which the algorithm finds similar objects; association, in which it finds connections between objects; and dimensionality reduction, in which it reduces the number of variables in order to reduce noise.
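As an illustration of clustering, here is a minimal sketch of k-means on one-dimensional data in plain Python. The order values and the starting centers are invented for the example; the point is only that no labels are supplied and the algorithm groups similar values on its own.

```python
def k_means_1d(values, centers, iterations=10):
    """Very small k-means for one-dimensional data.

    `centers` are starting guesses. Each round assigns every value to its
    nearest center, then moves each center to the mean of its group.
    Returns the final centers and the groups from the last assignment round.
    """
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ended up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical order values in euros: small and large purchases mixed together.
order_values = [12, 15, 14, 90, 95, 88]
centers, groups = k_means_1d(order_values, centers=[0, 100])

print(groups)
```

With these inputs the small and the large orders end up in separate groups, even though the algorithm was never told which is which.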
4. Training the algorithms
After the algorithms have been selected, the model must be trained, that is, the data is fed into the model. Model accuracy is critical here. Although there are no internationally standardized or generally accepted thresholds, it is extremely important to measure model accuracy within the selected framework. Setting an acceptable minimum threshold and applying rigorous statistical discipline play a decisive role. Models usually need fine-tuning, so expect to train the model more than once. For example, if the model's predictive power turns out to be too low, it has to be revised and all of the steps above have to be checked again.
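The train, measure, compare-to-threshold cycle can be sketched as follows. Everything here is an assumption made for illustration: the synthetic data, the very simple "model" (a single cut-off value), and the minimum accuracy of 0.8, which in practice has to be chosen for the problem at hand.

```python
import random

MIN_ACCURACY = 0.8  # assumed threshold; the right value depends on the problem


def make_data(n, seed):
    """Hypothetical data: one feature in [0, 1]; the label is 1 when the
    feature exceeds 0.5, with about 10% of labels flipped as noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:
            label = 1 - label
        data.append((x, label))
    return data


def accuracy(threshold, data):
    """Share of examples for which 'x > threshold' matches the label."""
    correct = sum(int(x > threshold) == y for x, y in data)
    return correct / len(data)


def train(data):
    """'Training' here means picking the cut-off with the best training accuracy."""
    candidates = [i / 100 for i in range(101)]
    return max(candidates, key=lambda t: accuracy(t, data))


data = make_data(400, seed=0)
train_set, test_set = data[:300], data[300:]  # hold out data the model never sees

threshold = train(train_set)
test_accuracy = accuracy(threshold, test_set)
needs_revision = test_accuracy < MIN_ACCURACY

print(f"threshold={threshold:.2f} test_accuracy={test_accuracy:.2f}")
print("revise model and re-check earlier steps:", needs_revision)
```

Accuracy is measured on held-out data, not on the training data, because the training score alone would overstate how well the model predicts new cases.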
5. Choosing the most suitable programming language for AI
This depends on your needs and on many other factors. Nowadays, data scientists and ordinary users have access to a wide range of programming languages, from classics such as C++ and Java to Python. Python and R are currently the most popular and widely used programming languages in data science. Both are very powerful for data analysis, especially because of their many packages and extensive libraries for machine learning. One very powerful library for computational linguistics is NLTK (the Natural Language Toolkit), a collection of libraries and programs written in Python.
6. Selection of the platform
You don’t necessarily have to buy your own server, database, and so on. You can instead choose a pre-built platform that offers all of these services.
These pre-built platforms (machine learning as a service) have been one of the most useful pieces of infrastructure behind the spread of machine learning. They were developed to make machine learning easier and simpler. They often deliver advanced, cloud-based analytics that can combine multiple algorithms and languages.
Rapid deployment is crucial to the success of machine learning as a service. Typically, platforms help with tasks such as data preprocessing, model training, model evaluation and prediction. Because the platforms differ, it is important to evaluate them before committing to one.
The most popular platforms are Microsoft Azure Machine Learning, the Google Cloud Prediction API, TensorFlow, Ayasdi and others.
If you have any questions about setting up an AI system, you can contact the DFIclub team by email or Q&A.
- Amram is a technical analyst and partner at DFI Club Research, a high-tech research and advisory firm. He has over 10 years of technical and business experience with leading high-tech companies, including Huawei, Nokia and Ericsson, in ICT, semiconductors, microelectronic systems and embedded systems. Amram focuses on the business-critical points where new technologies drive innovation.