
Module 6: Named Entity Recognition and Text Classification

Named Entity Recognition (NER) is a natural language processing task that identifies and classifies named entities (such as people, organizations, locations, and dates) in text. For example, in "Apple released iPhone in 2007", NER would identify "Apple" as an organization, "iPhone" as a product, and "2007" as a date. Text classification assigns predefined categories or labels to text documents, for example categorizing emails as spam or not spam, or sorting news articles into topics such as sports, politics, and technology.

Module 6 will present different approaches to Named Entity Recognition/Extraction and Text Classification, as well as methods to evaluate model outputs against ground truth. We will use:

NER using models via the Impresso API (Impresso Project)
Question-answering-based named entity extraction using the small, extraction-tuned language model NuExtract
Text classification, comparing and evaluating a small (Qwen2.5-7b) and a mid-size (Nemotron-70b) model.
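
To make the idea of evaluating model outputs against ground truth concrete, here is a minimal sketch in plain Python. It is not taken from the course notebooks. The function name `precision_recall_f1`, the label names (`ORG`, `PRODUCT`, `DATE`), and the "predicted" entities are all assumptions made up for illustration, and exact matching of (text, label) pairs is only one of several possible scoring schemes.

```python
def precision_recall_f1(gold, predicted):
    """Compare two collections of (text, label) pairs using exact matching."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    # Guard against division by zero when either side is empty.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Ground truth for "Apple released iPhone in 2007" (label names are assumed).
gold = [("Apple", "ORG"), ("iPhone", "PRODUCT"), ("2007", "DATE")]
# Hypothetical model output: one wrong label and one missed entity.
predicted = [("Apple", "ORG"), ("iPhone", "ORG")]

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this made-up case, only "Apple" matches the ground truth. Half of the predictions are correct (precision 0.50), one of three true entities is found (recall 0.33), and F1, the harmonic mean of the two, is 0.40. The same three measures can be applied to text classification by comparing predicted and true labels per document.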

Preparation for Module 6:

No preparation needed.

Notebooks we will use in class:

Named Entity Recognition Impresso API

QA based Entity Extraction

Text Classification

Workload (after class):

Schedule an individual appointment with the course instructor (Sarah Oberbichler) as soon as you are ready to discuss your research project. Contact via email or Mattermost.

Date and Time:

January 10, 2025 (10:00 AM to 11:30 AM)