Natural Language Processing (NLP) is one of the most popular fields in Data Science and AI, and one that we, as Internet users, come into contact with every day.
NLP is all around us: in auto-correction, text translation and prediction, email filters, smart assistants, digital phone calls, search results, and more. The range of possibilities is huge, and this technology is already widely used in e-commerce, e-governance, education, and health.
From this perspective, NLP is a Data Science domain that offers tremendous prospects and career opportunities. That’s why, for their final project, the second team of students from the Data Science Bootcamp chose to develop Rubik, Brainster’s new NLP virtual assistant.
You will have the opportunity to meet Rubik soon. He is currently dedicated to learning, so he can be ready to answer all of your questions. Today we will talk to the team behind his creation. The team is a mix of mentors and students: Alexander, Philip, Martina, and Gabi.
If you want to work on projects like this one and add them to your portfolio, giving you a great advantage in the labor market, apply to the Fall Data Science Bootcamp.
Brainster: Colleagues, a year of hard work is behind you. It has resulted in a very advanced project in the Natural Language Processing field. Perhaps the best indicator of your progress would be for you to tell us what preconceptions you had before enrolling in the Data Science Bootcamp.
-Looking back, the progress we have made has exceeded all our expectations. We started this story with basic SQL knowledge, an affinity for mathematics, and extensive experience in data analysis.
Brainster: NLP is a very modern technology. In business, it is widely used for sentiment analysis and for chatbots, or virtual assistants. Your task was to create Rubik, Brainster’s new virtual assistant. What exactly was the motive behind this project, and where do you see it being implemented?
– The motive for Rubik’s development is the need to provide a higher level of service to Brainster’s potential clients. A successful implementation will mean answering potential clients’ questions automatically, in real time. This will free up employees’ time for higher-value tasks. By adopting modern technologies in its work, Brainster will confirm its role as an “early adopter” and set a good example for companies operating in the Macedonian market.
Brainster: You faced the painstaking work of creating your own dataset. How did a team of four manage to reach 3000+ questions? How did this process go?
-Step by step, by setting specific and achievable goals and with effective feedback from every team member. First, we put ourselves in the role of a potential candidate. We did a segmentation, evaluated the generated questions, created appropriate answers, and then discussed, modified, and supplemented them.
– The first goal was a base of 300 questions. We finished with a base 10 times bigger, and with the feeling of “I don’t want anyone to ask me a question :)”
Brainster: Rubik is still in the testing phase, but he is training hard to be available as soon as possible to anyone interested in the Brainster Bootcamp. Interestingly, future Bootcamp students will work on his continuous improvement. What do you expect from Rubik’s further development?
-We expect improved “response time” and instant answers to common questions. Potential candidates should get a concrete answer to their first question, without any additional sub-questions. We also expect the scope to expand and the model to be re-trained on questions that have already been asked, to increase its accuracy.
Brainster: Google is one of the companies that implement NLP in many of their products, such as Google Assistant, Google Translate, Search, Autocorrect, and targeted advertising. This technology is therefore very promising for anyone interested in a Data Science career. How competent do you feel to work in this field after graduating from the Bootcamp?
-Participating in this project has given us a solid foundation for further work in this field, and it has already aroused our curiosity for further research. A well-developed chatbot is a data source for further analysis, a field in which we feel comfortable given our previous experience.
Brainster: Where do you see the end-game of these process automation technologies? Can NLP completely eliminate the human factor from services such as customer support and customer relations, translation, and hiring and recruitment?
-The full potential of chatbots is yet to be discovered. As expected, future development will focus on software that performs operations normally performed by humans. After their mass application in customer care, sophisticated NLP algorithms will be applied intensively in the above-mentioned areas.
Brainster: Will this rise in automation increase the demand for Data Scientists and Machine Learning Engineers?
-For several years now, Data Scientists and Machine Learning Engineers have been high on the list of most popular professions. This trend is expected to continue, as automation has not even begun in some industries. For example, developing an intelligent chatbot requires an understanding of machine learning, AI, and NLP technologies, as well as knowledge of back-end programming and various programming languages and technologies.
Brainster: Martina and Gabi, we talked to you once at the beginning of the year, when you were finishing the Statistics module. You come from the financial sector. How would you summarize the second part of the program, which was dedicated to Python, Machine Learning, and Big Data?
-The second part of the program was the real challenge for us because, in the first part, we can say that we were on “familiar ground”. It took us from writing our first code, through data processing and building predictive models, to working with Big Data, all of it rounded off with a project that will find practical application.
Brainster: You are currently working on new projects with the students of the Bootcamp, as part of our Data Science Hub. What advice would you give your colleagues to help them cope with the responsibilities and challenges you have already been through in the program?
-Crucial to our success in the Bootcamp were a huge desire to upgrade our knowledge and an open mind toward new challenges, as well as hard work and dedication throughout the past year. You should stay focused and make the weekly workshops enjoyable through teamwork.
Brainster: Finally, a question for the mentors. The 5th group of the Bootcamp has already completed the program. How do you see the evolution we have gone through in the past year, specifically in terms of the program and the students’ experience with the online lectures?
-With each passing hour, each of the students gained more self-confidence in solving the tasks and in the modules they attended. I can say that each of them was impatiently waiting for the lecture to start. They looked forward to the exercises planned for the current week, to see whether and how they could solve the task.
Satisfaction always came with great results and a competitive spirit among colleagues. After one year, 41 workshops, about 80 homework assignments solved, several books read, and 4 completed projects, there was not a single student who was not dedicated to this challenge.
The program itself can be exhausting, and at times the students need motivation. We hold motivational conversations twice a year with each student separately. We give each of them feedback, pointing out any weaknesses and what they should concentrate on. In the same conversations, we give the students direction, depending on whether they are good at traditional data analysis, computer vision, text work, or predictions.
Follow the link to find more detailed information about the project that Team 2 worked on: https://github.com/filipgd1/NLP-Chatbot
-The fear that everyone has at the beginning disappears very quickly, because all of the lectures are recorded and can be viewed again. Also, any question asked by colleagues is answered in detail. If a part is missed, we can go back to it. Naturally, each of us gets tired at some point during the year, but we still study without stopping for 365 days :). Online lectures allow us to catch up later. However, we insist on being present live, because discussion is a treasure for every future Data Scientist.
It is a great feeling to see how students “overcome their fears”, how their interest and curiosity grow, and how much time and effort they are willing to put in to cope with what they are doing. Colleagues managed to create a positive atmosphere filled with creativity, teamwork, and cooperation. I think we, as mentors, have also learned a lot from the Bootcamps.
If you have ever thought about a future-proof career in Data Science, have a look at our remote Bootcamps.