<% seo %>

Acquire, Extract, Analyze Web Data

Empowering you to make Data-Driven Decisions.

Open Source Datasets

Text Analysis

Broad Crawls

E-Commerce Data

Leverage Open Source Datasets

Process and leverage existing datasets for your business objectives:

Wikipedia, YAGO, and DBpedia for entity resolution
GDELT project for global event and news data
GeoNames for resolving addresses to geolocations

Text and Data Analysis

Approximate record matching of products, companies, people, etc.
Content extraction from HTML, PDF, and other documents
Natural language processing
Document clustering
Image classification using deep neural networks
Parsing of postal addresses
Parsing of phone numbers
Indexing and querying thousands of websites
Website technology detection
Automatic extraction of tabular data

Custom Broad Crawls

Extract information from millions of websites and make it actionable and queryable.

We routinely crawl thousands to millions of websites, extracting either standard information or custom details for each use case, such as:

contact data (postal address, phone, VAT, etc.)
high-level information about the company
industry classification
etc.

Extraction of Structured Data

Extract data from millions of structured websites (e.g. E-Commerce):

product name
manufacturer
SKU
price
etc.