

OIT Graduates Develop Tagging Technology to Help Digital Publications

November 01, 2015

A team of Olivet Institute of Technology graduates worked with the Olivet University Research and Development Center to develop a new text taxonomy and tagging technology intended to help online newspapers categorize their content more efficiently.

Traditionally, news editors must manually choose or write tags (keywords) for each article, attach those tags to the article’s metadata, and then place the article in the appropriate categories.

“It is tremendous work,” said a member of the team. “We saw the burden on editors, who often have hundreds of pieces to review and want to publish them as soon as possible.”

With assistance from OU’s R&D Center, the OIT graduates developed a Named Entity Recognizer that extracts entities such as people, organizations, and locations from articles. The team also researched Natural Language Processing and machine learning algorithms.
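To give a rough idea of what automatic tagging involves, the sketch below finds known entity names in an article and attaches them as tags. It is only an illustration and not the team's implementation: it replaces a trained recognizer with a simple dictionary lookup, and the names `GAZETTEER`, `Article`, `extract_entities`, and `tag_article`, along with the sample entries, are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical lookup table: entity name -> entity type.
# A trained recognizer would learn these from data instead.
GAZETTEER = {
    "Olivet University": "ORGANIZATION",
    "San Francisco": "LOCATION",
    "Jane Doe": "PERSON",
}


@dataclass
class Article:
    title: str
    body: str
    tags: list = field(default_factory=list)  # filled in by tag_article


def extract_entities(text, gazetteer=GAZETTEER):
    """Return (name, type) pairs for known entities, in order of first appearance."""
    found = []
    for name, entity_type in gazetteer.items():
        # Match the whole name only, not a fragment inside a longer word.
        match = re.search(r"\b" + re.escape(name) + r"\b", text)
        if match:
            found.append((match.start(), name, entity_type))
    return [(name, entity_type) for _, name, entity_type in sorted(found)]


def tag_article(article):
    """Attach the extracted entity names to the article's metadata as tags."""
    article.tags = [name for name, _ in extract_entities(article.body)]
    return article
```

Calling `tag_article` on an article whose body mentions any of the sample names fills `article.tags` with those names. A dictionary lookup cannot recognize names it has never seen, which is one reason to use a statistical recognizer trained on a large set of articles instead.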

“We studied hundreds of thousands of news articles on the Web,” said another member of the team. “Thanks to many open source projects and data such as NLTK, Stanford NER, and Linked Open Data, we can train our identifier and examine the results.”

More than 20,000 articles and 10,000 topics and tags were included in the training dataset.

The technology is still in its beta stage.
