
Any details when converting txt to t7 format? #15

Open
iinjuly opened this issue Mar 7, 2017 · 3 comments

Comments


iinjuly commented Mar 7, 2017

Hi,
We want to run your model on other datasets. According to the instructions provided, we need to convert our text into .t7 format. We wrote a script to implement the conversion and tested it on a sample txt file from the COCO dataset, but our script produces different results from that sample's corresponding .t7 data that you provided.
Here is our conversion script:

```lua
require 'image'
require 'nn'
require 'nngraph'
require 'cunn'
require 'cutorch'
require 'cudnn'
require 'lfs'
torch.setdefaulttensortype('torch.FloatTensor')

-- Note: the double quote and backslash must be escaped inside the Lua string.
local alphabet = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{} "
local dict = {}
for i = 1, #alphabet do
  dict[alphabet:sub(i,i)] = i
end
ivocab = {}
for k,v in pairs(dict) do
  ivocab[v] = k
end

opt = {
  filenames = '',
  dataset = 'cub',
  batchSize = 16,        -- number of samples to produce
  noisetype = 'normal',  -- type of noise distribution (uniform / normal)
  imsize = 1,            -- used to produce larger images. 1 = 64px, 2 = 80px, 3 = 96px, ...
  noisemode = 'random',  -- random / line / linefull1d / linefull
  gpu = 1,               -- gpu mode. 0 = CPU, 1 = GPU
  display = 0,           -- display image: 0 = false, 1 = true
  nz = 100,
  doc_length = 201,
  queries = 'test-caption.txt',
  checkpoint_dir = '',
  net_gen = '',
  net_txt = '',
}

for k,v in pairs(opt) do opt[k] = tonumber(os.getenv(k)) or os.getenv(k) or opt[k] end
print(opt)
if opt.display == 0 then opt.display = false end

noise = torch.Tensor(opt.batchSize, opt.nz, opt.imsize, opt.imsize)
net_gen = torch.load(opt.checkpoint_dir .. '/' .. opt.net_gen)
net_txt = torch.load(opt.net_txt)
if net_txt.protos ~= nil then net_txt = net_txt.protos.enc_doc end

net_gen:evaluate()
net_txt:evaluate()

-- Extract all text features (5 captions, 1024-d embedding each).
local fea_txt = torch.Tensor(5, 1024)
local idx = 1
-- Decode text for sanity check.
local raw_txt = {}
local raw_img = {}
for query_str in io.lines(opt.queries) do
  -- One-hot encode the caption: one column per alphabet character.
  local txt = torch.zeros(1, opt.doc_length, #alphabet)
  for t = 1, opt.doc_length do
    local ch = query_str:sub(t,t)
    local ix = dict[ch]
    if ix ~= nil and ix ~= 0 then
      txt[{1,t,ix}] = 1
    end
  end
  raw_txt[#raw_txt+1] = query_str
  txt = txt:cuda()
  print('idx =', idx, 'txt size', txt:size())
  print('query_str =', query_str)
  local tmp = net_txt:forward(txt):float():clone()
  fea_txt[idx] = tmp
  idx = idx + 1
  print('tmp size', tmp:size())
end

torch.save('fea-txt.t7', fea_txt)
```

Why can't we get exactly the same .t7 result as the one provided? Did we miss some detail? Any directions or hints would be appreciated.
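One thing worth ruling out first is the character-to-index mapping itself. Below is a quick NumPy sketch (not part of the original repo) of the same one-hot encoding the Lua script performs, using the alphabet and `doc_length = 201` from the script above. It makes it easy to check, outside Torch, that captions are lowercased correctly and that out-of-alphabet characters (e.g. uppercase letters) silently become all-zero rows, which would change the encoder's output.

```python
import numpy as np

# Same 70-character alphabet as the Lua script; in Python the double
# quote and backslash are escaped as \" and \\.
alphabet = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{} "
# 0-based indices here; the Lua dict is 1-based.
char_to_ix = {ch: i for i, ch in enumerate(alphabet)}

def encode(text, doc_length=201):
    """One-hot encode `text` into a (doc_length, len(alphabet)) array.
    Characters past doc_length are truncated; characters not in the
    alphabet (e.g. uppercase letters) leave an all-zero row."""
    onehot = np.zeros((doc_length, len(alphabet)), dtype=np.float32)
    for t, ch in enumerate(text[:doc_length]):
        ix = char_to_ix.get(ch)
        if ix is not None:
            onehot[t, ix] = 1.0
    return onehot

enc = encode("a bird with a red head.")
print(enc.shape)        # (201, 70)
print(enc[0].argmax())  # 0 -> 'a'
```

A caption containing any character the alphabet lacks (uppercase, curly quotes, non-ASCII) will encode differently from a cleaned version of the same caption, so normalizing the text the same way the provided .t7 files were produced is essential for a byte-identical result.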


xhzhao commented Mar 8, 2017

+1 for the txt-to-.t7 preprocessing.


yeates commented May 8, 2018

Have you figured this out yet?
It should be a common problem, since everyone runs into the t7-format obstacle when trying to train on their own caption data. Maybe it's just a simple question... please share some hints or tricks.

@SreenijaK

@iinjuly were you able to process the txt file to .t7 format?


4 participants