Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] ULMFiT for Text Classification #168

Closed
wants to merge 21 commits into from

Conversation

ComputerMaestro
Copy link
Contributor

@ComputerMaestro ComputerMaestro commented Aug 17, 2019

Universal Language Model Fine-tuning for Text Classification. Here the model will be used for sentiment analysis as of now.

  • Files
  • Language Model
  • Binary Sentiment Analyzer
  • Docs
  • Tests
  • TextAnalysis.jl way (Documents)
  • Question classifier
  • AG news

src/ULMFiT/custom_layers.jl Outdated Show resolved Hide resolved
@ComputerMaestro
Copy link
Contributor Author

@aviks did you upload the weights for which I gave the link?? if not I have different weights for Language model

@aviks
Copy link
Member

aviks commented Aug 21, 2019 via email

@ComputerMaestro
Copy link
Contributor Author

Can you please give me that link??

@aviks
Copy link
Member

aviks commented Aug 23, 2019

No actually, sorry, I was confused. I did not upload the weights for ULMFiT. Let me know when they are ready, and I will upload.

@ComputerMaestro
Copy link
Contributor Author

https://drive.google.com/open?id=1Ki8XH_hkJc8qlqUBqMN8KyFYHcHcFo_N
These are pretrained weights for ULMFiT Language model.

@ComputerMaestro
Copy link
Contributor Author

https://drive.google.com/open?id=1lE3DiVs7RvesGVnu2LqqNEVmJ8QET3Tq
@aviks , These are the weights for bin sentiment classifier with about 91% accuracy. Please upload these weights.

@ComputerMaestro
Copy link
Contributor Author

@aviks I will test the model for the AG news dataset as well, and will let you know the results soon.
Please let me know if there is something to change in the PR code.

PooledDense
"""

import Flux: gate, _testmode!, _dropout_kernel

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testmode! will be deprecated, on Flux#master and in the upcoming releases.

y = similar(x, size(x))
Flux.rand!(y)
y .= Flux._dropout_kernel.(y, p, 1 - p)
return y

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use a normal dropout layer here? Internal APIs could change without notice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is to generate the mask for the variational dropout VarDrop . I will add a custom function instead of using _dropout_kernel their.

@assert 0 ≤ p ≤ 1
cell = WeightDroppedLSTMCell(
param(init(out*4, in)),
param(init(out*4, out)),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

param is not needed anymore with Zygote


Flux.@treelike WeightDroppedLSTMCell

_testmode!(m::WeightDroppedLSTMCell, test) = (m.active = !test)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if avg_fact != 1
layer.accum = layer.accum .+ Tracker.data.(p)
for (ps, accum) in zip(p, layer.accum)
Tracker.data(ps) .= avg_fact*accum

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data is deprecated

function discriminative_step!(layers, ηL::Float64, l, opts::Vector)
@assert length(opts) == length(layers)
# Gradient calculation
grads = Tracker.gradient(() -> l, get_trainable_params(layers))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using Zygote, maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have less experience with zygote, I will look into it and will make changes

for (layer, opt) in zip(layers, opts)
opt.eta = ηl
for ps in get_trainable_params([layer])
Tracker.update!(opt, ps, grads[ps])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The update! from Flux.Optimise will be better suited, since Tracker is decoupled

@aviks
Copy link
Member

aviks commented Nov 25, 2019

@ComputerMaestro sorry for not looking at this for a while. Shall we get this finished up?

The pretrained weights are at https://github.com/JuliaText/TextAnalysis.jl/releases/download/v0.6.0/bin_sentiment_classifier_weights.bson.xz

Are all of Dhairya's feedback incorporated? (Except Zygote, we'll move to it later)

@@ -112,9 +112,19 @@ module TextAnalysis
include("sequence/pos.jl")
include("sequence/sequence_models.jl")

# ULMFiT
include("ULMFiT/utils.jl")
include("ULMFiT/WikiText103_DataDeps.jl.jl")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file does not exist

Flux.testmode!(classifier)
loss = 0
iters = take!(gen)
(num_of_batches != : & num_of_batches < iters) && (iters = num_of_batches)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like a typo here. this line does not compile!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed that

test/ulmfit.jl Outdated
@test length(params(awd)) == 5
end

@test "VarDrop" begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be @testset

@@ -0,0 +1,104 @@
@testset "Custom layers" begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test file is not included in runtests.jl

@ComputerMaestro
Copy link
Contributor Author

ComputerMaestro commented Dec 10, 2019

@aviks , there seems to be a problem with implementing this now if we are using Flux 0.10.0 (latest). Since it supports Zygote . And the model I have made is based on Tracker . So it might break if we only put some changes to make it fit for current flux version. Can I make Flux less than or equal to 0.9.0 as a requirement??? For now till there is a zygote implementation

@aviks aviks mentioned this pull request Dec 12, 2019
@aviks
Copy link
Member

aviks commented Dec 12, 2019

Updated version in #179

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants