Streamlining Data Research and Analysis with Natural Language Processing
|UN SDGs|01 No Poverty, 02 Zero Hunger, 10 Reduced Inequalities|
|Countries|Ethiopia, Kenya, Netherlands, Rwanda, Sierra Leone, United Republic of Tanzania|
|Technologies|AI, Node.js, PHP, Python, MySQL|
Laterite.ai is a cutting-edge technology-enabled platform designed to assist data research professionals in optimising their data analysis processes. Laterite, a renowned social impact research organisation with operations in the Netherlands and Africa, partnered with AndAnotherDay to develop an accessible web platform that leverages Natural Language Processing (NLP) techniques for faster research and analytics.
Like many industry peers, Laterite faced significant challenges in handling large volumes of information that required meticulous verification, correction, and logging. Data processing consumed substantial time that could otherwise be spent on more productive research activities. To address this issue, Laterite sought a web application that could streamline the data sanitisation process and provide data research professionals with tools for enhanced efficiency. Machine Learning, particularly Large Language Models, was a key component of the solution, enabling meaning extraction, sentiment analysis, and the derivation of valuable insights from textual data.
The primary objective was to alleviate the repetitive aspects of research by equipping researchers with tools that enhance efficiency and foster innovative thinking. By streamlining the research process, the application aimed to empower researchers to maximise the social impact of their work.
The application lets researchers register, select a subscription tier, make online payments, and use their allocated tokens to access various research apps. Tokens reset on a monthly basis and can be replenished by purchasing additional tokens online. The token system aligned platform usage with OpenAI’s pricing model, which helped manage the costs associated with data processing.
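The token mechanics described above can be illustrated with a minimal sketch. It is not the platform's actual code: the `TokenAccount` class, its field names, and the "YYYY-MM" period strings are hypothetical, and the sketch assumes that a monthly reset restores the balance to the tier allowance rather than carrying purchased tokens over.

```python
from dataclasses import dataclass


@dataclass
class TokenAccount:
    """Hypothetical per-user token balance (illustrative names only)."""

    monthly_allowance: int  # tokens granted by the subscription tier
    balance: int = 0        # tokens currently available to spend
    period: str = ""        # "YYYY-MM" of the most recent reset

    def refresh(self, current_period: str) -> None:
        # Assumption: a new month resets the balance to the tier allowance.
        if current_period != self.period:
            self.balance = self.monthly_allowance
            self.period = current_period

    def top_up(self, tokens: int) -> None:
        # Called after a successful online purchase of extra tokens.
        if tokens <= 0:
            raise ValueError("top-up must be a positive number of tokens")
        self.balance += tokens

    def spend(self, tokens: int) -> bool:
        # Refuse the request instead of letting the balance go negative.
        if tokens > self.balance:
            return False
        self.balance -= tokens
        return True
```

Checking `spend` before each call to a language model is one way to keep per-user costs bounded by what the subscription has already paid for.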
The design approach centred on simplicity, ensuring familiarity and ease of use for the target audience. The colour scheme adhered to Laterite’s corporate branding, complemented by a combination of AI-generated images and photographs from Africa. Page loading times were optimised, with minimal use of transitions and animations, favouring a flat user interface (UI) design.
The project adopted a waterfall approach, prioritising features for the Alpha release using the MoSCoW prioritisation method. Throughout the development process, additional apps were added to test the code’s robustness. This approach allowed for stress-testing the application, identifying areas for improvement as more apps were integrated.
The frontend was implemented in PHP for its speed and reliability, while Node.js served as middleware to handle lengthy requests. Python was used to interface with AI technologies.
The development team encountered several challenges related to the time AI takes to process information. Node.js was introduced to avoid timeouts on requests that ran longer than a web server would keep a connection open while waiting for a response. Users could then either wait or return to the application later while previously submitted requests were processed. To manage busy periods, request queuing was implemented: text-based requests were prioritised for expedited processing, while file uploads were processed as server resources became available.
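The queuing behaviour described above can be sketched as a small priority queue. This is an illustration under stated assumptions, not the project's Node.js middleware: the `RequestQueue` class, the `TEXT` and `FILE` constants, and the `allow_files` flag (standing in for a check on server capacity) are all hypothetical names.

```python
import heapq
import itertools

TEXT, FILE = 0, 1  # assumed ordering: the lower number is served first


class RequestQueue:
    """Hypothetical job queue: text requests jump ahead of file uploads."""

    def __init__(self):
        self._heap = []
        self._ids = itertools.count()  # increasing IDs keep FIFO order per kind

    def submit(self, kind, payload):
        # Return a job ID immediately so the caller can come back later
        # instead of holding a connection open until the work finishes.
        job_id = next(self._ids)
        heapq.heappush(self._heap, (kind, job_id, payload))
        return job_id

    def next_job(self, allow_files=True):
        # allow_files=False models "no spare server resources for uploads".
        if not self._heap:
            return None
        kind, job_id, payload = self._heap[0]
        if kind == FILE and not allow_files:
            return None  # only uploads are waiting and there is no capacity
        heapq.heappop(self._heap)
        return job_id, kind, payload
```

A worker would call `next_job` in a loop and store each result against its job ID, which the user's browser can then poll for.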
The suite of automated tools leverages advanced large language models, including OpenAI’s GPT-3 solutions and NVIDIA’s NeMo LLMs. These models support diverse tasks such as coding assistance, error correction, survey creation, data classification, and topic modelling, among others.
Stripe integration facilitated secure payment processing for subscriptions and token purchases.
Security measures encompassed server-side storage of secure keys and implementation of rate limits to prevent abuse and spamming.
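The source does not say which rate-limiting scheme was used. As one common option, a sliding-window limiter per client can be sketched as follows; the `SlidingWindowLimiter` class and its parameters are assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most max_requests per client per window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client ID -> recent request times

    def allow(self, client_id, now=None):
        # `now` can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject without recording the hit
        hits.append(now)
        return True
```

Keeping this check on the server, alongside the API keys, means a client cannot bypass it by modifying frontend code.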
Although the simple design initially fulfilled the project requirements, it was conceived prior to the rise in popularity of AI, particularly ChatGPT. The next phase of the project will introduce stylistic changes that show users exactly where their request is in the process and how long they may have to wait, giving an experience similar to that of other AI platforms.
Throughout the platform’s development, the team implemented meticulous logging at every stage of the data journey. Due to the reliance on third-party applications that returned varying responses, precise issue detection at the specific time and location of occurrence became crucial.
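One way to log each stage of a request so that issues can be traced to a time and place is to attach a request ID and a stage name to every log record. The sketch below uses Python's standard `logging` module; the logger name, the format string, and the `build_logger` and `log_stage` helpers are hypothetical and are not taken from the project.

```python
import io
import logging

# Assumed format: timestamp, level, request ID, pipeline stage, message.
LOG_FORMAT = (
    "%(asctime)s %(levelname)s request=%(request_id)s "
    "stage=%(stage)s %(message)s"
)


def build_logger(stream, name="research_app"):
    """Create a logger that writes stage-tagged records to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger


def log_stage(logger, request_id, stage, message, level=logging.INFO):
    # `extra` adds the custom fields that the format string refers to.
    logger.log(level, message, extra={"request_id": request_id, "stage": stage})


buffer = io.StringIO()
log = build_logger(buffer)
log_stage(log, "req-42", "upload", "file received")
log_stage(log, "req-42", "llm_call", "unexpected response shape", logging.WARNING)
```

Filtering the output by `request=req-42` then reconstructs the full journey of a single request across stages, including any third-party call that returned an unexpected response.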
The application was delivered with all the features needed for the alpha release.
We anticipate that the next phases of the project will introduce more apps by Laterite and other researchers, covering media types beyond text, documents, and spreadsheets, to create a reliable suite of tools that helps researchers do their jobs and, in turn, improve the lives of communities across Africa and throughout the world.