Health Care Data Analytics: Unit 6: Machine Learning and Natural Language Processing - Lecture B

Dr Chris Paton - Digital Health, Informatics & AI

[00:00]

welcome to healthcare data analytics

[00:03]

machine learning and natural language

[00:05]

processing

[00:06]

this is lecture b the component

[00:09]

healthcare data analytics

[00:11]

covers the topic of healthcare data

[00:13]

analytics which applies the use of data

[00:15]

statistical and quantitative analysis

[00:18]

and explanatory and predictive models to

[00:20]

drive decisions and actions in health

[00:22]

care

[00:23]

the learning objectives for this unit

[00:25]

machine learning and natural language

[00:27]

processing are to

[00:29]

describe the major tasks for which

[00:31]

machine learning is used

[00:33]

compare and contrast the major

[00:35]

approaches for machine learning

[00:37]

describe the major tasks for which

[00:39]

natural language processing is used

[00:42]

and discuss the major approaches and

[00:44]

challenges for processing clinical

[00:45]

narratives

[00:47]

in this lecture we begin our discussion

[00:49]

of natural language processing or nlp of

[00:52]

clinical text

[00:54]

first we'll look at basic definitions

[00:56]

and approaches to nlp

[00:58]

this will be followed by challenges in

[01:00]

processing the clinical narrative

[01:03]

in the next lecture we'll discuss

[01:05]

various clinical nlp approaches and

[01:07]

projects

[01:08]

and finally we'll describe alternatives

[01:11]

and future directions

[01:12]

let's begin with basic definitions and

[01:15]

approaches

[01:16]

successful nlp of the clinical narrative

[01:19]

could help better enable the use of data

[01:21]

in electronic health records

[01:23]

or ehrs we know for example

[01:26]

that current coded data such as icd-10

[01:29]

does not cover the complexity of what's

[01:31]

described in the clinical narrative

[01:34]

we also know that a good deal of

[01:36]

clinical information

[01:37]

is locked in that text meaning we cannot

[01:40]

easily extract and process the

[01:41]

information to use for various purposes

[01:44]

some have noted that the term nlp could

[01:47]

actually be better described as natural

[01:49]

language understanding

[01:51]

because the goal of nlp is the

[01:53]

understanding of natural language in

[01:55]

computerized text

[01:57]

for those desiring more detail on the

[01:59]

various approaches to nlp

[02:01]

and its uses the references given in the

[02:03]

last few slides of this presentation can

[02:06]

be consulted

[02:07]

what are some of the use cases for

[02:09]

clinical nlp

[02:11]

the three major ones are listed on this

[02:13]

slide the first use case is

[02:15]

classification

[02:16]

where we're trying to classify what we

[02:18]

find in the text into some sort of

[02:20]

category for example

[02:23]

we may want to classify a patient

[02:24]

finding into a category

[02:26]

such as when determining if they might

[02:28]

be eligible for a clinical study

[02:30]

probably the major use case for clinical

[02:33]

nlp

[02:33]

is extraction where we want to extract

[02:36]

information from a clinical narrative

[02:39]

for example we might want to extract the

[02:41]

findings that occur in a radiology

[02:43]

report

[02:44]

and even the measurements that are

[02:46]

reported within that text
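
The extraction use case described above can be sketched with a regular expression. This is a minimal illustration, assuming measurements are written as a number followed by "mm" or "cm"; the pattern name `MEASUREMENT`, the function `extract_measurements`, and the sample report are hypothetical and not from the lecture.

```python
import re

# Hypothetical pattern: a number (optionally decimal) followed by a length unit.
MEASUREMENT = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)


def extract_measurements(report):
    """Return (value, unit) pairs found in free-text report content."""
    return [(float(value), unit.lower()) for value, unit in MEASUREMENT.findall(report)]


# Invented example sentence, not taken from a real radiology report.
report = "There is a 2.5 cm nodule in the left upper lobe and a 4 mm calcified granuloma."
print(extract_measurements(report))
```

A real system would also need to link each measurement to the finding it describes, which this sketch does not attempt.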

[02:48]

a third use case is summarization where

[02:51]

we may want to summarize or

[02:52]

abstract the information that's in the

[02:54]

narrative

[02:55]

we may do this for medical literature to

[02:58]

summarize scientific information

[03:00]

or the clinical narrative where we're

[03:02]

trying to summarize the major findings

[03:04]

that the patient has

[03:05]

we can delve further into use cases for

[03:08]

nlp

[03:08]

by considering cancer care this set of

[03:11]

use cases comes from some promotional

[03:13]

literature from a company that sells

[03:15]

clinical nlp

[03:16]

products but actually gives a good set

[03:19]

of use cases for which nlp might help us

[03:22]

for example we might identify potential

[03:25]

clinical trial matches

[03:27]

something akin to what we mentioned on

[03:28]

the last slide

[03:30]

we might be able to do advanced

[03:32]

information extraction from complex

[03:34]

patient documents

[03:35]

we may be able to carry out more precise

[03:37]

information retrieval for clinical case

[03:39]

histories and outcome studies

[03:42]

we also may be able to streamline the

[03:44]

process of entering patients into

[03:46]

cancer registries in addition we may be

[03:49]

able to use the data that we extract

[03:51]

using nlp

[03:52]

to apply predictive models and care

[03:54]

coordination rules to clinical

[03:56]

narratives in the patient record

[03:58]

we may also be able to perform semantic

[04:00]

enrichment of patient documentation to

[04:03]

improve the ability to search their

[04:04]

notes

[04:05]

we can analyze patient narratives for

[04:08]

insights into treatment outcomes

[04:10]

and also to assess the effect of genetic

[04:12]

aberrations on disease

[04:14]

finally we may be able to support tumor

[04:16]

boards

[04:17]

where the care of patients who developed

[04:19]

cancer is discussed by those providing

[04:21]

care for them

[04:22]

let's take a more detailed look at human

[04:25]

language so we can understand the

[04:27]

applications and limitations of nlp

[04:29]

tools and clinical documents

[04:32]

linguists talk about the levels of human

[04:34]

language

[04:35]

we begin with phonology the sound units

[04:38]

that make up a language's discrete

[04:40]

sounds

[04:40]

are called phonemes next level up is

[04:43]

morphology

[04:44]

which is the analysis of parts of words

[04:47]

which are called morphemes

[04:49]

sometimes a whole word is the morpheme

[04:51]

but other times they may be bound

[04:53]

morphemes that are part of the word

[04:56]

for example many anatomic locations such

[04:59]

as the appendix

[05:00]

are bound to another word such as itis

[05:03]

indicating inflammation

[05:05]

thus appendicitis indicates there's

[05:07]

inflammation of the appendix

[05:09]

there are other morphemes such as

[05:11]

pharyng and itis

[05:13]

there are also bound morphemes that

[05:15]

indicate procedures such as an

[05:17]

appendectomy
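
The idea of bound morphemes can be illustrated with a small suffix splitter. This is only a sketch under the assumption that a two-entry suffix table is enough for the examples mentioned; `SUFFIXES` and `split_bound_suffix` are hypothetical names, and real medical morphology is far richer.

```python
# Hypothetical, tiny suffix table covering only the lecture's examples.
SUFFIXES = {"itis": "inflammation", "ectomy": "surgical removal"}


def split_bound_suffix(word):
    """Return (stem, suffix, suffix_meaning), or None if no known suffix matches."""
    word = word.lower()
    for suffix, meaning in SUFFIXES.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], suffix, meaning
    return None


for term in ["appendicitis", "pharyngitis", "appendectomy", "fever"]:
    print(term, "->", split_bound_suffix(term))
```

Note that the stems come out as "appendic", "pharyng", and "append", so mapping a stem back to the anatomic location would need a further lookup step.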

[05:18]

syntax refers to the rules that govern

[05:20]

the construction of language

[05:22]

sometimes called the grammar semantics

[05:25]

describes the meanings of the words

[05:27]

phrases and the sentences that make up

[05:29]

language

[05:30]

linguists also talk about pragmatics

[05:33]

which is how the context of language

[05:35]

affects its meaning

[05:36]

and then there's the larger world

[05:38]

knowledge that's not explicitly part of

[05:40]

language

[05:41]

but is the general knowledge that's

[05:42]

necessary to understand it

[05:44]

the classic approach to nlp goes through

[05:47]

three phases

[05:48]

the first phase is syntax where we

[05:51]

attempt to recognize the grammatical

[05:53]

constituents of language

[05:55]

sentences phrases within them and down

[05:57]

to nouns

[05:58]

verbs adjectives etc the next phase is

[06:02]

semantics

[06:03]

where we attempt to recognize the

[06:05]

meaning of those words

[06:06]

phrases and sentences finally is the

[06:09]

context in which the sentence occurs

[06:11]

each of these levels is successively

[06:13]

harder and requires more knowledge

[06:15]

engineering

[06:16]

but would add more value if we could

[06:18]

solve those problems

[06:20]

one of the ways we address the inability

[06:22]

to completely perform classic nlp

[06:24]

is through the use of rules and matching

[06:27]

where we don't aim for complete

[06:28]

understanding of everything in the

[06:30]

document

[06:31]

but instead try to recognize the terms

[06:33]

that occur and perhaps normalize them

[06:36]

this may allow us to understand what was

[06:38]

said or

[06:39]

instead of using detailed grammar rules

[06:41]

we may use machine learning techniques

[06:43]

where we learn the rules of parsing

[06:45]

rather than developing human

[06:47]

enumerations of all the possible grammar

[06:49]

rules that might exist
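
The rules-and-matching approach described above, where terms are recognized and normalized without full understanding, might look roughly like this. The lexicon, its labels, and the function `find_terms` are assumptions made for illustration; the labels are plain strings, not real terminology codes.

```python
import re

# Hypothetical lexicon mapping surface variants to one normalized label.
LEXICON = {
    "high blood pressure": "hypertension",
    "hypertension": "hypertension",
    "htn": "hypertension",
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
}


def find_terms(text):
    """Return normalized labels for lexicon entries found in the text, in order of appearance."""
    lowered = text.lower()
    hits = []
    for variant, label in LEXICON.items():
        # Word boundaries keep short variants such as "mi" from matching inside longer words.
        for match in re.finditer(r"\b" + re.escape(variant) + r"\b", lowered):
            hits.append((match.start(), label))
    return [label for _, label in sorted(hits)]


print(find_terms("Pt with HTN and prior heart attack."))
```

This kind of matching recognizes that a term occurs but says nothing about whether it is negated, uncertain, or historical.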

[06:51]

let's explore syntax and semantics in

[06:53]

more detail

[06:54]

processing of syntax is usually done via

[06:57]

a technique called parsing

[06:59]

this requires a grammar which is the

[07:01]

rules that govern the syntax of language

[07:04]

the most common way that we express a

[07:06]

grammar is as a set of rewrite rules

[07:09]

where a more complex grammatical

[07:11]

construct is rewritten from constituent

[07:13]

parts

[07:14]

for example a relatively simple

[07:17]

grammatical rule

[07:18]

is that a sentence consists of a noun

[07:20]

phrase a verb phrase

[07:22]

and a noun phrase an example is

[07:25]

the patient has severe hypertension the

[07:28]

first

[07:29]

noun phrase is the patient the verb

[07:31]

phrase is the verb has

[07:34]

the second noun phrase is severe

[07:36]

hypertension

[07:37]

of course noun phrases themselves can be

[07:40]

rewritten into more basic constituents

[07:43]

there are determiners such as a

[07:44]

grammatical article an example of which

[07:47]

is the word

[07:48]

the there are also adjectives such as

[07:51]

severe

[07:52]

and a noun phrase can also just consist

[07:54]

of a single noun

[07:56]

the symbols that cannot be further

[07:58]

decomposed such as an adjective and a

[08:00]

noun

[08:01]

are called terminal symbols likewise

[08:04]

those that can be further decomposed

[08:06]

such as sentence and noun phrase are

[08:08]

called non-terminal symbols
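
The rewrite rules just described can be written down and applied with a small backtracking parser. This is a sketch, assuming the toy grammar from the example (sentence rewritten as noun phrase, verb phrase, noun phrase) and a hypothetical five-word lexicon; the names `GRAMMAR`, `LEXICON`, `parse`, `expand`, and `parse_sentence` are illustrative only.

```python
# Rewrite rules: each non-terminal symbol maps to its alternative expansions.
GRAMMAR = {
    "S": [["NP", "VP", "NP"]],
    "NP": [["Det", "N"], ["Adj", "N"], ["N"]],
    "VP": [["V"]],
}
# Hypothetical toy lexicon assigning one terminal symbol (part of speech) to each word.
LEXICON = {"the": "Det", "patient": "N", "has": "V", "severe": "Adj", "hypertension": "N"}


def parse(symbol, words, start):
    """Yield (tree, next_position) for every way symbol can cover words[start:next_position]."""
    if symbol not in GRAMMAR:  # terminal symbol: must match the word's part of speech
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for alternative in GRAMMAR[symbol]:
        yield from expand(symbol, alternative, words, start, [])


def expand(symbol, remaining, words, position, children):
    """Match the symbols of one rewrite rule left to right, backtracking over alternatives."""
    if not remaining:
        yield (symbol, *children), position
        return
    for child, next_position in parse(remaining[0], words, position):
        yield from expand(symbol, remaining[1:], words, next_position, children + [child])


def parse_sentence(sentence):
    """Return the first parse tree covering the whole sentence, or None if none exists."""
    words = sentence.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


print(parse_sentence("the patient has severe hypertension"))
```

Even this tiny grammar rejects a fragment such as "stiff neck and fever", which hints at why enumerating rules by hand does not scale to real clinical text.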

[08:10]

as you can imagine the grammar

[08:12]

supporting the english language can get

[08:14]

highly complex

[08:15]

with many many rewrite rules this is why

[08:19]

the machine learning approach has

[08:20]

superseded the approach of trying to

[08:22]

enumerate

[08:23]

every last grammar rule in semantics

[08:26]

we aim to map the parts of speech these

[08:29]

nouns

[08:30]

adjectives verbs etc into standardized

[08:32]

terminology

[08:34]

for medicine probably the most

[08:36]

descriptive terminology

[08:37]

is snomed ct processing language has

[08:40]

been one of the most challenging

[08:42]

computer tasks

[08:43]

and is difficult not only in the

[08:45]

clinical narrative but almost in all

[08:47]

forms of natural language

[08:49]

clinical narratives such as progress

[08:52]

notes and discharge summaries

[08:53]

can be even more difficult to process

[08:55]

than other types of text for many

[08:57]

reasons

[08:58]

one is that clinical narratives are

[09:00]

written in a telegraphic

[09:01]

elliptical style oftentimes the

[09:04]

narratives are not complete sentences

[09:07]

we'll see examples of that in a moment

[09:09]

clinical text

[09:10]

also may have spelling errors or

[09:12]

grammatical errors

[09:14]

we also know that physicians and others

[09:16]

may take license with language

[09:18]

and oftentimes there may be important

[09:20]

information that's buried within

[09:22]

normal language that's implicit but not

[09:24]

actually in the words and phrases

[09:27]

we'll look at some of the challenges at

[09:28]

the syntactic semantic

[09:30]

and contextual levels here's a look at

[09:33]

some of the syntactic challenges that

[09:34]

were first enumerated by sager in the

[09:36]

1980s

[09:38]

others have since validated these

[09:40]

challenges as mentioned in the previous

[09:42]

slide

[09:43]

a great deal of clinical narrative text

[09:45]

is syntactically incomplete

[09:47]

that is at least according to sager's

[09:49]

analysis

[09:50]

half of all sentences in the clinical

[09:52]

narrative were found to be grammatically

[09:54]

incomplete

[09:56]

if we think of the minimal english

[09:57]

sentence as subject-verb-object

[10:00]

we see different types of incomplete

[10:02]

sentences

[10:03]

for example the medical record may

[10:06]

delete the verb and object

[10:08]

when the text says stiff neck and fever

[10:11]

there has been a deletion of the verb

[10:12]

and object from the sentence

[10:14]

in brain scan negative there's deletion

[10:17]

of the verb is

[10:19]

for the statement positive for heart

[10:21]

disease there is deletion of the subject

[10:23]

and verb

[10:24]

such as the patient has and finally

[10:28]

was seen by local doctor has deletion of

[10:30]

the subject

[10:32]

as humans we can read these and still

[10:34]

for the most part

[10:35]

understand what's happening but computer

[10:37]

algorithms

[10:38]

especially those that are solely based

[10:40]

on rules have difficulty with these

[10:43]

sorts of violations of rules of basic

[10:45]

english grammar

[10:46]

there are also semantic challenges which

[10:49]

again

[10:49]

as humans especially those who have some

[10:52]

clinical knowledge we readily understand

[10:55]

but to a computer that's just

[10:57]

functioning based on rules

[10:58]

there's a lot more difficulty we know

[11:01]

that words have different senses and

[11:03]

meanings

[11:04]

for example when we read in a medical

[11:06]

chart murmur is appreciated

[11:08]

we know that likely there's a clinician

[11:10]

who's listening probably with a

[11:12]

stethoscope to the heart and there's a

[11:14]

murmur

[11:15]

it's not so much that the murmur is

[11:17]

appreciated in the sense of it being

[11:19]

liked

[11:20]

by the same token when we read about eye

[11:23]

drops

[11:23]

we're thinking about drops of liquid

[11:25]

containing medication put into the eye

[11:28]

and not the eye physically dropping

[11:30]

likewise when we read

[11:32]

mass at three o'clock we know that we're

[11:34]

likely reading about something that's

[11:36]

felt on the left-hand side of the

[11:37]

abdomen

[11:38]

and not that there's a religious service

[11:40]

in the afternoon

[11:42]

another semantic challenge is synonymy

[11:44]

where different words and phrases have

[11:46]

the same meaning but they're expressed

[11:48]

differently

[11:49]

for example consider the phrase

[11:52]

epigastric pain

[11:53]

after eating versus another phrase

[11:55]

postprandial stomach discomfort

[11:58]

these two phrases have no words in

[12:00]

common but essentially mean the same

[12:02]

thing
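
The synonymy problem can be made concrete by measuring word overlap between the two phrases. This is a minimal sketch using Jaccard overlap on lowercased, whitespace-split words; the function name `token_overlap` is hypothetical.

```python
def token_overlap(a, b):
    """Jaccard overlap between the word sets of two phrases (0.0 means no shared words)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


print(token_overlap("epigastric pain after eating", "postprandial stomach discomfort"))  # 0.0
```

A score of zero for two phrases with essentially the same meaning shows why pure string matching needs to be backed by a terminology or a learned representation.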

[12:04]

there is also polysemy where the same

[12:06]

words and phrases have different

[12:08]

meanings

[12:09]

for example someone might say the pcp of

[12:12]

the patient with pcp

[12:14]

advised him to stop using pcp

[12:17]

pcp is an acronym that stands for

[12:20]

several things

[12:21]

such as primary care physician

[12:23]

pneumocystis carinii pneumonia

[12:26]

or an abbreviated name for the drug

[12:28]

phencyclidine
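
One simple way to approach the ambiguity of an acronym like pcp is to look at the surrounding words. The sketch below is an assumption-laden toy: the cue words in `SENSE_CUES` are invented for illustration, and a real system would more likely learn such cues from annotated data.

```python
# Hypothetical cue words for each expansion of "pcp".
SENSE_CUES = {
    "primary care physician": {"advised", "referred", "appointment", "clinic", "saw"},
    "pneumocystis pneumonia": {"hiv", "infection", "pneumonia", "cough", "hypoxia"},
    "phencyclidine": {"using", "abuse", "intoxication", "drug", "smoked"},
}


def guess_sense(context):
    """Return the sense sharing the most cue words with the context, or None if nothing matches."""
    words = set(context.lower().split())
    scores = {sense: len(cues & words) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_sense("his pcp referred him to the clinic"))
print(guess_sense("hiv patient with pcp and cough"))
print(guess_sense("pcp intoxication after drug abuse"))
```

The lecture's own example sentence uses all three senses at once, so a workable approach would have to score each mention with its own local window rather than the whole sentence.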

[12:29]

there are a number of additional

[12:30]

semantic challenges

[12:32]

one is negation the clinical narrative

[12:35]

is often full of negation

[12:37]

clinicians may say the patient does not

[12:39]

have this finding or that finding

[12:41]

or that this disease is not present or

[12:44]

saying we're choosing not to use this

[12:46]

treatment

[12:47]

and instead are using another one

[12:49]

negation is common in medical text

[12:52]

for example patient does not have any

[12:54]

chest pain

[12:56]

there is also uncertainty in natural

[12:58]

language text

[12:59]

clinicians may say things like patient

[13:01]

treated for possible pneumonia

[13:04]

there is also temporality just because

[13:07]

something is mentioned

[13:08]

doesn't mean that it's present now for

[13:10]

example

[13:11]

patient has history of pneumonia or

[13:14]

there might be something that's been

[13:15]

resolved

[13:16]

such as chest pain resolved after

[13:19]

administration of nitroglycerin
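
Negation, uncertainty, and temporality are often handled with trigger phrases. The following is a simplified sketch, assuming a sentence-level check with a handful of invented triggers; the `TRIGGERS` table, the label names, and `assertion_status` are hypothetical and do not reproduce any published algorithm.

```python
import re

# Hypothetical trigger phrases grouped by the status they suggest.
TRIGGERS = {
    "negated": [r"\bdoes not have\b", r"\bno\b", r"\bdenies\b", r"\bwithout\b"],
    "uncertain": [r"\bpossible\b", r"\bprobable\b", r"\bsuspected\b"],
    "not_current": [r"\bhistory of\b", r"\bresolved\b"],
}


def assertion_status(sentence, concept):
    """Return status labels for a concept, based on triggers found anywhere in the sentence."""
    lowered = sentence.lower()
    if concept.lower() not in lowered:
        return set()  # the concept is not mentioned at all
    labels = {
        label
        for label, patterns in TRIGGERS.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    }
    return labels or {"affirmed"}


print(assertion_status("patient does not have any chest pain", "chest pain"))
print(assertion_status("patient treated for possible pneumonia", "pneumonia"))
print(assertion_status("patient has history of pneumonia", "pneumonia"))
print(assertion_status("chest pain resolved after administration of nitroglycerin", "chest pain"))
```

Because the check is sentence-wide, a trigger that belongs to a different finding in the same sentence would be applied to the wrong concept; limiting each trigger to a window of nearby words is the usual refinement.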

[13:21]

there are also contextual challenges in

[13:23]

the clinical narrative

[13:25]

the term that describes a broad category

[13:27]

of these is co-reference

[13:29]

which is the relation between linguistic

[13:31]

expressions that refer to the same real

[13:33]

world entity

[13:34]

consider the sentence chest x-ray shows

[13:37]

nodule in left upper lobe

[13:40]

followed by another sentence the tumor

[13:42]

has increased in size to two centimeters

[13:45]

the phrase the tumor from the second

[13:47]

sentence is actually referring to that

[13:49]

same nodule from the first sentence

[13:52]

there's a particular type of

[13:53]

co-reference that can be challenging

[13:55]

which is anaphora or the use of pronouns

[13:58]

consider these two sentences he

[14:01]

complains of

[14:02]

chest pain it awakens him at night

[14:05]

it in the second sentence refers to

[14:07]

chest pain in the first sentence

[14:09]

there's another type of contextual

[14:11]

challenge where there's the deletion of

[14:12]

subjects

[14:13]

this is quite common in clinical

[14:15]

narratives so we may see strings of

[14:17]

sentences such as

[14:19]

complains of chest pain increasing

[14:22]

frequency

[14:23]

worse in the morning again as human

[14:26]

readers

[14:26]

we usually understand that quite easily

[14:29]

but when we have a natural language

[14:31]

processing system

[14:32]

the computer may not make the

[14:34]

connections across the sentences

[14:36]

are there any silver linings that may

[14:38]

enable us to have hope that we can carry

[14:40]

out clinical nlp

[14:42]

it turns out that there are first is the

[14:44]

notion of sub-grammars

[14:46]

after work by sager in medicine and

[14:48]

other disciplines

[14:49]

she determined that there were

[14:51]

sub-grammars that were grammars that

[14:53]

were specific to disciplines

[14:55]

and that there was a sub-grammar of

[14:56]

clinical narratives that were actually

[14:58]

fairly regular and predictable

[15:01]

another finding is that medical charts

[15:03]

tend to have a predictable discourse

[15:05]

especially documents like the history

[15:07]

and physical

[15:08]

where the document begins with the

[15:10]

history of the patient

[15:12]

goes into the past medical history and

[15:14]

then into the physical exam

[15:16]

physicians for the most part follow a

[15:18]

well-prescribed pathway through the exam
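
The predictable discourse of a history and physical can be exploited by splitting a note into sections before any further processing. This is a sketch, assuming each section starts on its own line with a header followed by a colon; the header list, `HEADER_RE`, and `split_sections` are hypothetical, and real notes vary by institution.

```python
import re

# Hypothetical section headers for a history-and-physical note.
HEADERS = ["history of present illness", "past medical history", "physical exam"]
HEADER_RE = re.compile(
    r"^(" + "|".join(re.escape(header) for header in HEADERS) + r"):\s*",
    re.IGNORECASE | re.MULTILINE,
)


def split_sections(note):
    """Return a dict mapping each recognized header (lowercased) to the text that follows it."""
    sections = {}
    matches = list(HEADER_RE.finditer(note))
    for index, match in enumerate(matches):
        end = matches[index + 1].start() if index + 1 < len(matches) else len(note)
        sections[match.group(1).lower()] = note[match.end():end].strip()
    return sections


# Invented example note.
note = (
    "History of Present Illness: chest pain for two days\n"
    "Past Medical History: hypertension\n"
    "Physical Exam: murmur is appreciated\n"
)
print(split_sections(note))
```

Knowing the section helps with interpretation: a condition listed under past medical history is more likely historical than one in the physical exam.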

[15:22]

more recently another silver lining has

[15:24]

been

[15:25]

that we should abandon the notion of

[15:26]

processing the entire clinical narrative

[15:29]

and instead focus on specific elements

[15:32]

that we can identify to indicate whether

[15:34]

or not a specific disease

[15:36]

or specific clinical finding is present

[15:39]

thus giving up on the approach of

[15:40]

processing everything

[15:42]

and instead focusing on specific

[15:44]

elements present

[15:45]

before we look at usage of clinical nlp

[15:48]

and systems for it in the next lecture

[15:50]

let's talk briefly about how we evaluate

[15:53]

how well nlp systems work

[15:56]

there are a variety of ways that systems

[15:58]

can be measured

[15:59]

but basically we want to determine how

[16:01]

well they identify

[16:02]

correct concepts and how well they don't

[16:04]

identify incorrect concepts

[16:07]

the measures that we typically use are

[16:09]

recall and precision

[16:11]

recall is the proportion of correct

[16:13]

concepts found

[16:15]

for example if there are 100 concepts

[16:17]

that should be found by an nlp system

[16:20]

and 75 actually are found then the

[16:23]

recall is 75 percent

[16:25]

precision is the proportion of found

[16:27]

concepts that are correct

[16:29]

so if we identify 150 concepts

[16:32]

and 75 of them are correct then our

[16:34]

precision is 50

[16:36]

percent many evaluations of nlp

[16:39]

are carried out in so-called challenge

[16:41]

evaluations

[16:42]

where there is a common data set that

[16:44]

different researchers use

[16:46]

these different research groups will

[16:48]

compare the results on the same

[16:50]

task for the clinical nlp community

[16:53]

the largest and most participatory

[16:55]

challenge evaluation

[16:56]

has been the i2b2 nlp shared task
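
The recall and precision definitions given earlier in this section can be computed directly from the set of concepts a system found and the set it should have found. This is a minimal sketch; the function name `precision_recall` and the integer concept identifiers are placeholders chosen to reproduce the lecture's numbers.

```python
def precision_recall(found, gold):
    """Return (precision, recall) for a set of found concepts against the gold-standard set."""
    correct = len(found & gold)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Placeholder identifiers: 100 concepts should be found; the system returns 150, of which 75 are correct.
gold = set(range(100))
found = set(range(75)) | set(range(1000, 1075))

precision, recall = precision_recall(found, gold)
print("precision:", precision)  # 0.5
print("recall:", recall)        # 0.75
```

The same counts give both figures from the lecture: 75 of 100 expected concepts found is 75 percent recall, and 75 correct out of 150 returned is 50 percent precision.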

[17:00]

there has also been a systematic review

[17:02]

of all studies through 2010 that was

[17:04]

published

[17:05]

and will be described more in the next

[17:07]

lecture this concludes lecture b

[17:09]

of machine learning and natural language

[17:11]

processing

[17:13]

in summarizing this lecture we learned

[17:15]

the major use cases for nlp

[17:18]

are classification extraction and

[17:20]

summarization

[17:22]

the major phases of nlp are syntax

[17:25]

semantics and context each of which has

[17:28]

challenges

[17:29]

and is successively harder to do with

[17:31]

computers and

[17:32]

there may be some silver linings to help

[17:34]

with nlp

[17:35]

such as sub-grammars predictable

[17:37]

discourse and

[17:39]

focus on processing less than the entire

[17:41]

meaning of everything in the document

