

New AI system to extract data from the internet

by CIO AXIS November 16, 2016


Scientists have developed a new artificial intelligence system that can more effectively extract data from the vast wealth of information present on the internet.

The data necessary to answer myriad questions – about, say, the correlations between the industrial use of certain chemicals and incidents of disease, or between patterns of news coverage and voter-poll results – may all be online in the form of plain text. However, extracting data from plain text and organising it for quantitative analysis may be prohibitively time-consuming.

Researchers from the Massachusetts Institute of Technology (MIT) in the US have developed a new approach to information extraction. Most machine-learning systems work by combing through training examples and looking for patterns that correspond to classifications provided by human annotators. For instance, humans might label parts of speech in a set of texts, and the machine-learning system will try to identify patterns that resolve ambiguities, such as when “her” is a direct object and when it is an adjective. Typically, computer scientists try to feed their machine-learning systems as much training data as possible, which generally increases the chances that a system will be able to handle difficult problems. In the new research, by contrast, the scientists trained their system on scanty data.

“In information extraction, traditionally, in natural-language processing, you are given an article and you need to do whatever it takes to extract correctly from this article,” said Regina Barzilay, a professor at MIT. A machine-learning system will generally assign each of its classifications a confidence score, which is a measure of the statistical likelihood that the classification is correct, given the patterns discerned in the training data. With the new system, if the confidence score is too low, the system automatically generates a web search query designed to pull up texts likely to contain the data it is trying to extract. It then attempts to extract the relevant data from one of the new texts and reconciles the results with those of its initial extraction. If the confidence score remains too low, it moves on to the next text pulled up by the search string, and so on.
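The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: the names `Extraction`, `reconcile`, `extract_with_search`, `toy_extract` and `toy_search`, the 0.8 threshold and the limit of five texts are all hypothetical. In particular, the "keep the more confident result" rule is a hand-written stand-in for the reconciliation step, and the search function returns canned texts rather than querying the web.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Extraction:
    value: Optional[str]  # extracted data item, or None if nothing was found
    confidence: float     # extractor's confidence, between 0 and 1


def reconcile(current: Extraction, new: Extraction) -> Extraction:
    """Hypothetical fusion rule: keep whichever result is more confident."""
    return new if new.confidence > current.confidence else current


def extract_with_search(
    article: str,
    extract: Callable[[str], Extraction],
    make_query: Callable[[str], str],
    search: Callable[[str], Iterable[str]],
    threshold: float = 0.8,  # assumed cut-off for "confident enough"
    max_texts: int = 5,      # assumed limit on extra texts to consult
) -> Extraction:
    best = extract(article)
    if best.confidence >= threshold:
        return best  # the original article was enough

    # Confidence too low: look for other texts likely to contain the data.
    query = make_query(article)
    for i, text in enumerate(search(query)):
        if i >= max_texts:
            break
        best = reconcile(best, extract(text))
        if best.confidence >= threshold:
            break  # stop as soon as the result is confident enough
    return best


def toy_extract(text: str) -> Extraction:
    """Toy extractor: confident only when the count is stated explicitly."""
    match = re.search(r"injured:\s*(\d+)", text)
    if match:
        return Extraction(match.group(1), 0.9)
    return Extraction(None, 0.1)


def toy_search(query: str) -> list:
    """Toy search: returns canned texts instead of querying the web."""
    return ["An unrelated story.", "Officials confirmed injured: 12 people."]


if __name__ == "__main__":
    result = extract_with_search(
        "Several people were hurt.", toy_extract, lambda a: a, toy_search
    )
    print(result)
```

In this toy run the original article yields a low-confidence result, so the sketch consults the canned search results and stops at the second text, the first one that clears the threshold.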

Every decision the system makes is the result of machine learning. The system learns how to generate search queries, gauge the likelihood that a new text is relevant to its extraction task, and determine the best strategy for fusing the results of multiple attempts at extraction. The researchers compared their system’s performance to that of several extractors trained using more conventional machine-learning techniques. For every data item extracted in both of the tasks used in the comparison, the new system outperformed its predecessors, usually by about 10 per cent.



Reproduction in whole or in part in any form or medium without written permission of BitStream Mediaworks Pvt Ltd is strictly prohibited.
