Trusted by data scientists and data engineers
On-the-fly data source optimization for ML models:
When prompted with context from relevant external data, an LLM can produce noticeably better embeddings for the text in a data source.
Open Street Map is an example of a graph data source.
Thus, when multiple sources with different error distributions are combined, their ensemble tends to be more accurate than any single source, because the errors partly cancel out. This is similar to a consensus forecast.
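The ensemble effect can be illustrated with a small simulation. This is a minimal sketch on synthetic data, assuming three hypothetical sources whose errors are independent and zero-mean; it is not how Upgini combines sources. With correlated or biased errors the gain would be smaller.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N = 5000
truth = [random.uniform(0.0, 100.0) for _ in range(N)]

# Three hypothetical sources with independent, differently distributed errors.
source_a = [t + random.gauss(0.0, 1.0) for t in truth]              # Gaussian noise
source_b = [t + random.uniform(-2.0, 2.0) for t in truth]           # uniform noise
source_c = [t + random.triangular(-2.0, 2.0, 0.0) for t in truth]   # triangular noise


def rmse(estimates, reference):
    """Root mean squared error between two equally long sequences."""
    squared = sum((e - r) ** 2 for e, r in zip(estimates, reference))
    return math.sqrt(squared / len(reference))


# Simple consensus: the unweighted mean of the three sources.
ensemble = [(a + b + c) / 3.0 for a, b, c in zip(source_a, source_b, source_c)]

errors = {
    "a": rmse(source_a, truth),
    "b": rmse(source_b, truth),
    "c": rmse(source_c, truth),
    "ensemble": rmse(ensemble, truth),
}
for name, value in errors.items():
    print(f"{name}: RMSE = {value:.3f}")
```

An unweighted mean is the simplest choice here; weighting each source by its estimated reliability is a common refinement.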
If it finds the relevant information, it will automatically add a new search key - in this case, the postal code for each IP. This enables searching through all geo data sources in addition to IP sources.
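Adding a derived search key can be pictured as a lookup step. The sketch below is illustrative only: the `IP_TO_POSTAL` table, the column names, and the `add_postal_code_key` helper are hypothetical, and the addresses come from ranges reserved for documentation.

```python
# Hypothetical lookup table: IP address -> postal code (illustrative values only).
IP_TO_POSTAL = {
    "192.0.2.10": "10001",
    "198.51.100.7": "94105",
}


def add_postal_code_key(rows, ip_column="ip", lookup=IP_TO_POSTAL):
    """Return copies of rows with a 'postal_code' key; None when the IP is unknown."""
    enriched = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row["postal_code"] = lookup.get(row.get(ip_column))
        enriched.append(new_row)
    return enriched


rows = [
    {"ip": "192.0.2.10", "target": 1},
    {"ip": "203.0.113.5", "target": 0},  # not in the lookup table
]
enriched_rows = add_postal_code_key(rows)
print(enriched_rows)
```

Once every row carries a postal code, the same dataset can be matched against sources keyed by location as well as those keyed by IP.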
Large Language Models (LLMs) are capable of recognizing, summarizing, translating, predicting, and generating text. One of the most popular applications of LLMs is ChatGPT.
Several LLMs are integrated into Upgini data search to improve the accuracy of ML models.
Upgini enriches input texts with contextual information from external data sources and instructs the LLM based on that context. The LLM then generates more accurate embeddings from a combination of the initial text, the contextual information, and the generated text.
Upgini automatically generates optimized embeddings, using LLMs augmented with external data, for text in both connected data sources and the training datasets submitted for search.
Just launch a data search using Upgini with your labeled training dataset containing text columns, and Upgini will generate LLM embeddings from the text columns and check them for predictive power for your ML task.
Finally, Upgini will return your dataset enriched with only the relevant components of the LLM embeddings.
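Keeping only the relevant embedding components can be sketched as a filter on predictive power. The example below is an assumption-laden illustration, not Upgini's actual selection method: it uses synthetic embeddings, absolute Pearson correlation with the target as the relevance score, and a hypothetical threshold of 0.3.

```python
import math
import random

random.seed(42)  # fixed seed so the sketch is reproducible


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_components(embeddings, target, threshold=0.3):
    """Return indices of components whose |correlation| with the target meets the threshold."""
    selected = []
    for j in range(len(embeddings[0])):
        column = [row[j] for row in embeddings]
        if abs(pearson(column, target)) >= threshold:
            selected.append(j)
    return selected


# Synthetic stand-in for LLM embeddings: 500 rows, 4 components.
embeddings = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(500)]
# By construction, only components 0 and 2 carry signal about the target.
target = [2.0 * e[0] - 1.5 * e[2] + random.gauss(0.0, 0.5) for e in embeddings]

kept = select_components(embeddings, target)
reduced = [[row[j] for j in kept] for row in embeddings]
print("kept components:", kept)
```

Correlation only captures linear relationships; a model-based importance score would be a reasonable substitute when the signal is nonlinear.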
Raw string from a training dataset:
The Nook
Description generated by LLM without augmentation from external data:
The Nook is a line of e-readers and tablets produced by Barnes & Noble...
Description generated by LLM with augmentation from external data sources and advanced instructions:
The Nook is a tattoo shop located in Jefferson City, Missouri. The shop is known for....
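The difference between the two descriptions above comes from the context supplied to the LLM. A minimal sketch of that prompt assembly follows; the `build_augmented_prompt` helper, the instruction wording, and the `city` and `state` context fields are hypothetical, and no LLM is actually called.

```python
def build_augmented_prompt(raw_text, context):
    """Combine a raw string with external context into one instruction for an LLM."""
    context_lines = [f"- {key}: {value}" for key, value in sorted(context.items())]
    return "\n".join(
        [
            "Describe the entity named below, using the context to disambiguate it.",
            f"Entity: {raw_text}",
            "Context:",
            *context_lines,
        ]
    )


# Hypothetical context fields; the place name matches the example above.
prompt = build_augmented_prompt(
    "The Nook",
    {"city": "Jefferson City", "state": "Missouri"},
)
print(prompt)
```

Without the context lines, the same raw string is ambiguous, which is how the e-reader description arises.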
Air temperature
Precipitation
Wind
Air pressure
Normals
Sun hours
Moon phase
POI Categories:
Schools, restaurants, hotels, supermarkets, etc
Houses:
Residential buildings, business centers, etc.
Transport infrastructure:
Roads, public transport stops, etc
Public facilities:
Gov. offices, post office, police, etc
Natural features:
Public parks, green areas, etc
Stats for different distances (1 km / 3 km / 5 km)
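Distance-based statistics of this kind can be computed by counting points of interest within each radius. The sketch below assumes great-circle (haversine) distance and uses synthetic coordinates; the feature naming scheme and the `poi_counts` helper are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def poi_counts(lat, lon, pois, radii_km=(1, 3, 5)):
    """Count POIs per category within each radius around (lat, lon)."""
    categories = sorted({category for category, _, _ in pois})
    counts = {f"{c}_{r}km": 0 for c in categories for r in radii_km}
    for category, poi_lat, poi_lon in pois:
        distance = haversine_km(lat, lon, poi_lat, poi_lon)
        for r in radii_km:
            if distance <= r:
                counts[f"{category}_{r}km"] += 1
    return counts


# Synthetic POIs on the equator: (category, latitude, longitude).
pois = [
    ("school", 0.0, 0.005),      # about 0.56 km from the origin
    ("school", 0.0, 0.02),       # about 2.2 km
    ("restaurant", 0.0, 0.04),   # about 4.4 km
    ("restaurant", 0.0, 0.1),    # about 11 km, outside every radius
]
features = poi_counts(0.0, 0.0, pois)
print(features)
```

The radii are cumulative, so a POI within 1 km is also counted in the 3 km and 5 km features.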
Workweek calendars by countries
Public holidays / Observed holidays
Religious holidays
Sporting events
Political events
Consumer Price Index
GDP
Central Bank Rates
Commodities prices
Stock prices
Stock volumes
Currencies and exchange rates
Market indexes
Step by step guide