Data for Machine Learning Gets a New Lease on Language

InVimage by Venga Global creates clean data sets for annotation of text within images
 
SAN FRANCISCO - Jan. 9, 2020 - PRLog -- Venga has released the third solution in its growing suite of products for Natural Language Processing (NLP) data collection. The new addition to the family is InVimage, a cloud-based solution for annotating text within images. With each annotation, we automatically capture the X and Y coordinates, run OCR (Optical Character Recognition) on the annotated text, and have the option to translate the captured text by machine or manually.

Through our Human-in-the-Loop step, both the OCR and translation text can be reviewed and edited. This ensures Venga's clients receive the cleanest datasets possible for their training models. InVimage was built with scalability in mind and can handle hundreds of thousands of images daily.
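
For illustration only, one annotation record and its review step might be modeled as in the sketch below. The `TextAnnotation` class, its field names, and the `apply_review` helper are assumptions made for this example; they do not describe InVimage's actual data format or API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class TextAnnotation:
    """One annotated piece of text in an image (hypothetical schema)."""
    image_id: str
    x: int                              # X coordinate, assumed to be in pixels
    y: int                              # Y coordinate, assumed to be in pixels
    ocr_text: str                       # raw OCR output for the annotated text
    translation: Optional[str] = None   # machine or manual translation, if any
    reviewed: bool = False              # set once a human has checked the record


def apply_review(annotation, ocr_text=None, translation=None):
    """Return a reviewed copy, keeping any field the reviewer did not edit."""
    return replace(
        annotation,
        ocr_text=annotation.ocr_text if ocr_text is None else ocr_text,
        translation=annotation.translation if translation is None else translation,
        reviewed=True,
    )


# A reviewer corrects a misread character and leaves the translation as it was.
raw = TextAnnotation(image_id="img-001", x=120, y=48, ocr_text="EX1T", translation="SALIDA")
clean = apply_review(raw, ocr_text="EXIT")
```

Keeping the record immutable and returning a corrected copy means the raw OCR output stays available for comparison after review.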

At the beginning of 2019, Venga released a redesigned version of our first solution, InVtext, which eliminates many of the quality issues that have plagued data set text translation. InVvoice followed that summer, simplifying the management and translation of voice data.

"It has been a busy year for us," says Chris Phillips (COO). "Our data collection work has grown exponentially and we have had to scale our supply chain and development efforts accordingly. We've built those tools into a platform offering more options for our clients and resources."

Venga started working on data collection projects back in 2017. Some of the larger data collection buyers were not getting the improvements they expected in their models from other providers and wanted to test Venga in this field. Venga is the first to admit that it wasn't smooth sailing: the company suffered from delivery issues early on, but it learned quickly and overcame many of the challenges that were causing models to stagnate in their development.

"When working with 130 language pairs and well over 1,000 linguists, your processes must be on point. If technology is causing problems, people get frustrated, pull out from projects, or find ways to cut corners.

"We gathered our internal expertise and went through every pain point with a fine-tooth comb. Then we designed solutions that were up to the task. Our systems have been put to the test with the volumes we have processed this year, and we haven't had a single complaint or dropout due to technology," continued Phillips.

"We have now succeeded in delivering the key points for large-scale NLP data collection. These include data integrity, adherence to researchers' rules, working with low-resource languages, simple UX, and the ability to work in the cloud with low internet speeds.

"Our current solutions are tried and tested, so now we can focus on customizations and move development resources onto other interesting projects for our clients."

www.vengaglobal.com