Natural Language Processing (NLP) is one of the most popular fields in Data Science and AI. As Internet users, we come into contact with it every day. It is one of the must-have skills for all data scientists out there.
NLP is all around us: in auto-correction, text translation and prediction, email filters, smart assistants, digital phone calls, search results, and more.
The range of possibilities is huge: this technology is already widely used in e-commerce, e-governance, education, and health.
This makes NLP a Data Science domain that offers tremendous prospects and career opportunities. That’s why, for their final project, the second team of students from the Data Science Bootcamp chose to develop Rubik, Brainster’s new NLP virtual assistant.
You will have the opportunity to meet Rubik soon. He is currently busy learning, so he can be ready to answer all of your questions. Today we will talk to the team behind his creation. The team is a mix of mentors and students: Alexander, Philip, Martina, and Gabi.
If you want to work on projects like this one and add them to your portfolio, giving you a great advantage in the labor market, apply to the Fall Data Science Bootcamp.
Brainster: Colleagues, a year of hard work is behind you. It resulted in a very advanced project in the Natural Language Processing field. Perhaps the best indicator of progress is to ask: what preconceptions did you have before enrolling in the Data Science Bootcamp?
– Looking back, the progress we have made has exceeded all our expectations. We started this story with basic SQL knowledge, an affinity for mathematics, and extensive experience in data analysis.
Brainster: NLP is a very modern technology. It is widely used in business for sentiment analysis and for chatbots (virtual assistants). Your task was to create Rubik, Brainster’s new virtual assistant. What exactly was the motive behind this project? Where do you see it being implemented?
– The motive for Rubik’s development is the need to provide a higher level of service to Brainster’s potential clients. A successful implementation will mean automatically answering potential clients’ questions in real time. This will free up employees’ time for higher-value tasks. By adopting modern technologies in its work, Brainster will confirm its role as an “early adopter” and set a good example for companies operating in the Macedonian market.
Brainster: You faced the painstaking work of creating your own dataset. How did a team of four manage to reach 3000+ questions? How did this process go?
– Step by step. By setting specific and achievable goals, and with effective feedback from every team member. First, we put ourselves in the role of a potential candidate. We made a segmentation, evaluated the generated questions, created appropriate answers, and then discussed, modified, and supplemented them.
– The first goal was a base of 300 questions. We finished with a base ten times bigger, and with the feeling of “I don’t want anyone to ask me a question :)”
Brainster: Rubik is still in the testing phase, but he is training hard to be available as soon as possible to anyone interested in Brainster Bootcamps. Interestingly, future Bootcamp students will work on his continuous improvement. What do you expect from Rubik’s further development?
– We expect to improve the “response time” and to provide instant answers to common questions. Potential candidates should get a concrete answer to their first question, without any additional sub-questions. We also expect to expand the scope and re-train the model on already asked questions, and thereby increase its accuracy.
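For readers curious what "a concrete answer to the first question" might look like in code, here is a minimal sketch. It is not Rubik's actual implementation, which this interview does not describe. It assumes a simple retrieval approach: compare the incoming question with known questions using bag-of-words cosine similarity, return the stored answer on a close match, and log everything else for later re-training. The `FAQ` entries, the `answer` function, and the `threshold` value are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical question/answer pairs with placeholder answers;
# the real Rubik dataset is not shown in this article.
FAQ = [
    ("How long does the bootcamp last?", "<duration answer>"),
    ("How much does the bootcamp cost?", "<price answer>"),
    ("Are the lectures recorded?", "<recordings answer>"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def answer(question, faq=FAQ, threshold=0.5, unanswered=None):
    """Return the stored answer for the closest known question.

    If no known question is similar enough (threshold is an assumed
    value), return None and, when a list is supplied, record the
    question so it can be added to the dataset for re-training.
    """
    query = Counter(tokenize(question))
    best_score, best_answer = 0.0, None
    for known_question, known_answer in faq:
        score = cosine(query, Counter(tokenize(known_question)))
        if score > best_score:
            best_score, best_answer = score, known_answer
    if best_score >= threshold:
        return best_answer
    if unanswered is not None:
        unanswered.append(question)
    return None
```

A question close to a known one is answered directly, while an unfamiliar question returns `None` and ends up in the `unanswered` list. Questions collected this way are the kind of "already asked questions" that could later be answered by the team and used to expand the scope of the assistant.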
Brainster: Google is one of the companies that implement NLP in many of their products, such as Google Assistant, Google Translate, Search, autocorrect, and targeted advertising. This technology is therefore very promising for anyone interested in a Data Science career. How competent do you feel to work in this field after graduating from the Bootcamp?
– Participating in this project gives us a solid foundation for further work in this field. It has already aroused our curiosity for further research. A well-developed chatbot is a data source for further analysis, a field in which we feel comfortable given our previous experience.
Brainster: Where do you see the end-game of these automation technologies? Can NLP completely eliminate the human factor from services such as support and customer relations, translation, and hiring and recruitment?
– The full potential of chatbots is yet to be discovered. As expected, future development will focus on software that performs operations normally performed by humans. After mass adoption in customer care, sophisticated NLP algorithms will be applied intensively in the areas mentioned above.
Brainster: Will this rise in automation increase the demand for Data Scientists and Machine Learning Engineers?
– For several years now, Data Scientists and Machine Learning Engineers have been high on the list of the most popular professions. This trend is expected to continue, as automation has not even begun in some industries. For example, developing an intelligent chatbot requires an understanding of machine learning, AI, and NLP technologies, knowledge of back-end programming, and various programming languages and technologies.
Brainster: Martina and Gabi, we talked to you once at the beginning of the year. You were finishing the Statistics module. You come from the financial sector. How would you summarize the second part of the program, which was dedicated to Python, Machine Learning, and Big Data?
– The second part of the program was the real challenge for us, because in the first part we were, so to speak, on “familiar ground”. It took us from writing our first code, through data processing and building predictive models, to working with Big Data. And all of that was rounded off with a project that will find practical application.
Brainster: You are currently working on new projects with the Bootcamp students, as part of our Data Science Hub. What advice would you give your colleagues? How can they more easily cope with the responsibilities and challenges you have already gone through in the program?
– Crucial to our success in the Bootcamp were a huge desire to upgrade our knowledge and an open mind toward new challenges, as well as hard work and dedication throughout the past year. You should stay focused and make the weekly workshops enjoyable through teamwork.
Brainster: Finally, a question for the mentors. The 5th group of the Bootcamp has completed the program. How do you see the evolution we’ve gone through in the past year, specifically in terms of the program and the students’ experience with the online lectures?
– With each passing hour, each of the students gained more self-confidence in solving the tasks and in the modules they attended. I can say that each of them waited impatiently for the lecture to start. They were excited about the exercises that followed and curious to find out how they could solve the tasks.
Satisfaction comes with great results and a competitive spirit among colleagues. In one year, we managed to finish 41 workshops, around 80 homework assignments, several books, and 4 projects. And there was not a single student who was not dedicated to this challenge.
The program itself can be exhausting and at times requires extra motivation, so twice a year we hold a motivational conversation with each student separately. We give each student feedback on where there is room for improvement and what they should focus on. In the same conversations, we point out to the students whether they are strong in traditional data analysis, computer vision, text work, or predictions.
Follow the link for more detailed information about the project that Team 2 worked on: https://github.com/filipgd1/NLP-Chatbot
– The fear that everyone initially had turns into something positive, as all of the lectures are recorded and can be viewed later. Any question asked by a colleague is explained in detail. And if something is missed, we can go back and review it.
Normally, each of us gets tired at some point in the year. But still, we study without stopping for 365 days :). Online lectures allow us to listen to them later. However, we insist on attendance at the lectures, because discussion is invaluable for every future Data Scientist.
It is a great feeling to see how students “overcome their fears”, how their interest and curiosity grow day by day, and how much time and effort they are willing to put in to cope with what they are doing. The colleagues managed to create a positive atmosphere filled with creativity, teamwork, and cooperation. I think that we as mentors have also learned a lot from the Bootcamps.
If you have ever thought about a future-proof career in Data Science, have a look at our remote Bootcamps.