Conditional Random Fields

Understanding Conditional Random Fields

Submitted by huzhenda on Sat, 07/14/2018 - 11:47

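A conditional random field (CRF) is a discriminative probabilistic model: it models the conditional probability distribution of an output sequence given an input sequence. It is commonly used to label or parse sequential data.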

1. Which problems call for a CRF model

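Take part-of-speech tagging (POS tagging) in natural language processing as an example. The goal of POS tagging is to assign a part of speech (noun, verb, adjective, and so on) to every word in a sentence. Because the part of speech of a word often depends on the parts of speech of the words around it, a CRF is a good fit for this task.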

2. From random fields to Markov random fields

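First, let us introduce random fields. A random field is a collection of positions, each of which is randomly assigned a value according to some distribution; the positions together with their assigned values are called a random field. Take POS tagging again: suppose we need to tag a sentence of ten words. The part of speech of each of the ten words can be chosen from a known tag set (noun, verb, ...). Once a part of speech has been chosen for every word, the result is a random field.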
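The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-tag set, the uniform distribution, and the function name `sample_random_field` are hypothetical and do not come from the original post.

```python
import random

# Hypothetical tag set; the text only names nouns, verbs, and "so on".
TAGS = ["noun", "verb", "adjective"]


def sample_random_field(num_positions, tags=TAGS, seed=None):
    """Assign every position a tag drawn at random.

    The draw is uniform and independent per position, which is an
    assumption made for simplicity; any distribution would do.
    """
    rng = random.Random(seed)
    return [rng.choice(tags) for _ in range(num_positions)]


# Ten positions, one for each word of the ten-word sentence.
field = sample_random_field(10, seed=0)
print(field)
```

Each call yields one assignment of tags to the ten positions, that is, one realization of the random field. Nothing here ties a position to its neighbors yet.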

Introduction to Conditional Random Fields

Submitted by shiwenbin on Wed, 08/02/2017 - 09:22
Imagine you have a sequence of snapshots from a day in Justin Bieber’s life, and you want to label each image with the activity it represents (eating, sleeping, driving, etc.). How can you do this?

One way is to ignore the sequential nature of the snapshots, and build a per-image classifier. For example, given a month’s worth of labeled snapshots, you might learn that dark images taken at 6am tend to be about sleeping, images with lots of bright colors tend to be about dancing, images of cars are about driving, and so on.

By ignoring this sequential aspect, however, you lose a lot of information. For example, what happens if you see a close-up picture of a mouth – is it about singing or eating? If you know that the previous image is a picture of Justin Bieber eating or cooking, then it’s more likely this picture is about eating; if, however, the previous image contains Justin Bieber singing or dancing, then this one probably shows him singing as well. Thus, to increase the accuracy of our labeler, we should incorporate the labels of nearby photos, and this is precisely what a conditional random field does.
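The difference between labeling each image on its own and taking neighboring labels into account can be illustrated with a toy sketch. It is not a full CRF: there is no training and no probability normalization, and every score, label, and function name below is a made-up assumption chosen only to reproduce the mouth close-up situation described above.

```python
from itertools import product

# All scores below are made-up numbers for illustration only.
LABELS = ["eating", "singing"]

# Hypothetical per-image scores: how well each label fits an image alone.
# Image 0 clearly shows eating; image 1 is an ambiguous close-up of a mouth.
image_scores = [
    {"eating": 2.0, "singing": 0.1},
    {"eating": 1.0, "singing": 1.1},
]

# Hypothetical compatibility scores between labels of adjacent images:
# the same activity tends to continue from one snapshot to the next.
transition_scores = {
    ("eating", "eating"): 1.0,
    ("eating", "singing"): 0.0,
    ("singing", "eating"): 0.0,
    ("singing", "singing"): 1.0,
}


def label_independently(scores):
    """Per-image classifier: pick the best label for each image alone."""
    return [max(LABELS, key=lambda label: s[label]) for s in scores]


def label_jointly(scores, transitions):
    """Pick the label sequence with the highest total score.

    The total adds the per-image scores and the scores of adjacent label
    pairs. Brute force over all sequences is fine for a toy example.
    """
    def total(sequence):
        unary = sum(s[label] for s, label in zip(scores, sequence))
        pairwise = sum(transitions[pair] for pair in zip(sequence, sequence[1:]))
        return unary + pairwise

    return list(max(product(LABELS, repeat=len(scores)), key=total))


print(label_independently(image_scores))               # ['eating', 'singing']
print(label_jointly(image_scores, transition_scores))  # ['eating', 'eating']
```

Taken alone, the ambiguous close-up leans slightly toward singing. Once the score for adjacent labels is added, the sequence in which both images show eating wins.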