
peterjliu | 9 years ago

Author of post here. I'd say most of the examples generated from the best model were good. However we chose examples that were not too gruesome, as news can be :)

We encourage you to try the code and see for yourself.


probably_wrong | 9 years ago

How does the model deal with dangling anaphora[1]? I wrote a summarizer for Spanish following a recent paper as a side project, and it looks as if I'll need a month of work to solve the issue.

[1] That is, the problem of selecting a sentence such as "He approved the motion" and then realising that "he" is now undefined.

peterjliu | 9 years ago

We're not "selecting" sentences as an extractive summarizer might. The sentences are generated.

As for how the model deals with co-reference: there's no special logic for that.
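For readers unfamiliar with the distinction being drawn here, a minimal toy sketch of extractive selection versus abstractive generation. The "model" below is a canned stub, not the post's actual seq2seq network, and all names are made up for illustration:

```python
# Toy contrast between the two approaches; the "model" here is a stub,
# not the post's seq2seq network.
def extractive_summary(sentences, score):
    # Extractive: select an existing sentence verbatim.
    return max(sentences, key=score)

def abstractive_summary(next_token, max_len=10):
    # Abstractive: generate tokens one by one until an end marker.
    out = []
    while len(out) < max_len:
        tok = next_token(out)
        if tok == "<eos>":
            break
        out.append(tok)
    return " ".join(out)

sents = ["The senate met on Monday.", "He approved the motion."]
print(extractive_summary(sents, score=len))  # picks the longest sentence

canned = iter(["senator", "approves", "motion", "<eos>"])
print(abstractive_summary(lambda prefix: next(canned)))
# "senator approves motion"
```

Because the abstractive path emits tokens freely rather than lifting whole sentences, there is no selected sentence whose pronouns can dangle, though the generator can of course still produce an unresolved pronoun on its own.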

wutbrodo | 9 years ago

Wouldn't it suffice to do a coreference pass before extracting sentences? Obviously you'll compound coref errors with the errors in your main logic, but that seems somewhat unavoidable.
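That kind of pre-pass could look something like the toy sketch below. This is a naive single-heuristic substitution, not a real coreference system, and the document and names are invented for illustration:

```python
# Toy heuristic, not a real coreference resolver: track the most recent
# capitalized name and substitute it for a sentence-initial "He"/"She"
# before extraction, so a selected sentence can stand on its own.
PRONOUNS = {"He", "She"}

def resolve_pronouns(sentences):
    resolved = []
    last_entity = None
    for sent in sentences:
        words = sent.split()
        if words and words[0] in PRONOUNS and last_entity:
            words[0] = last_entity
            sent = " ".join(words)
        # Remember the last capitalized, non-sentence-initial token
        for w in words[1:]:
            if w[0].isupper() and w.isalpha():
                last_entity = w
        resolved.append(sent)
    return resolved

doc = ["Senator Smith introduced the motion on Monday.",
       "He approved the motion after the debate."]
print(resolve_pronouns(doc)[1])
# "Smith approved the motion after the debate."
```

A production system would of course use a trained coreference model rather than this heuristic, and, as noted above, its mistakes would compound with the extractor's.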

flamedoge | 9 years ago

That's inter-sentence logic, though. Even humans have trouble with that kind of ambiguity in some cases.

wintom | 9 years ago

In the post you mentioned that

> "In those tasks training from scratch with this model architecture does not do as well as some other techniques we're researching, but it serves as a baseline."

Can you elaborate a little on that? Is the training the problem, or is the model just not good at longer texts?

elyase | 9 years ago

Any chance some trained model will be released?

plusepsilon | 9 years ago

Any hints on how to integrate the whole document for summarization? ;)

I've seen CopyNet, where you do seq2seq but also add a copy mechanism that carries rare words from the source sentence over to the target sentence.
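The core of such copy mechanisms is a mixing step that blends the generator's vocabulary distribution with a copy distribution built from the attention weights. A minimal sketch in the spirit of CopyNet / pointer-generator models follows; all names and numbers are illustrative, not taken from any particular implementation:

```python
# Sketch of the copy-mechanism mixing step; p_gen is the (learned)
# probability of generating from the vocabulary rather than copying.
def mix_distributions(vocab_dist, attention, src_ids, p_gen):
    """Blend the generator's vocabulary distribution with a copy
    distribution built by scattering attention weights onto the
    vocabulary ids of the source tokens."""
    copy_dist = [0.0] * len(vocab_dist)
    for a, i in zip(attention, src_ids):
        copy_dist[i] += a  # repeated source tokens accumulate mass
    return [p_gen * v + (1.0 - p_gen) * c
            for v, c in zip(vocab_dist, copy_dist)]

vocab_dist = [0.4, 0.3, 0.2, 0.1, 0.0, 0.0]  # generator softmax, sums to 1
attention = [0.7, 0.2, 0.1]                  # weights over 3 source tokens
src_ids = [5, 2, 5]          # token id 5 is a rare word appearing twice
final = mix_distributions(vocab_dist, attention, src_ids, p_gen=0.5)
# Rare word 5 gets probability mass only through copying:
print(round(final[5], 6))  # 0.4
```

The attractive property is that a rare or out-of-vocabulary source word can receive probability even when the generator assigns it none, which is exactly the "copy rare words" behavior described above.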

fowlerpower | 9 years ago

Is it hard to get the code up and running on Google Cloud? Does TensorFlow come as a service?