Juno becomes extremely laggy with large HTMLElement Arrays in the workspace #221

Closed
nilshg opened this issue Jan 8, 2019 · 10 comments

Comments

@nilshg

nilshg commented Jan 8, 2019

As discussed on Slack, when I scrape a few hundred websites and store the results in an Array of HTMLElement{:HTML}'s, Juno becomes extremely slow if the workspace pane is open. Sample code to reproduce:

using HTTP, Gumbo, Cascadia, JLD2

# Root page
rt = "http://www.ncpe.ie/pharmacoeconomic-evaluations/archive/"

# Function to extract website source using HTTP and Gumbo
get_source(s) = parsehtml(String(HTTP.get(s).body)).root

# Get archive from page
archive = get_source(rt)

# Get HTML node for all drugs
nodes = eachmatch(Selector("span.the_article"), archive)

# Get the link to each drug's page
all_drugs = [i.children[1].attributes["href"] for i in nodes]

# Get HTML source for each drug (replace with serialization after first scrape)
all_sources = [get_source(drug) for drug ∈ all_drugs]
#@save "all_sources.jld2" all_sources
#@load "all_sources.jld2" all_sources

Details

  • Atom version: 1.33.1
  • Julia version: 1.0.1
  • OS: Windows 10
  • Package versions:
    • Atom.jl: 0.7.12
    • julia-client: 0.7.12
    • ink: 0.9.13
@dylanfesta

I experienced the same problem when loading large DataFrames with somewhat convoluted data structures inside. Juno becomes unusable, pausing for a long time after each command (even the simplest ones!). Closing the workspace pane restored normal responsiveness. Thanks for the workaround, it saved me a lot of hassle!

The issue now is not being able to use the workspace as long as I deal with these data structures.

@pearlzli

I'm having the same problem with a largish (~4000 rows) DataFrame, but for me, closing the workspace pane doesn't help. I wonder if #223 might help?

@pfitzseb
Member

pfitzseb commented Jul 29, 2019

Probably not, but you can just add a ; at the end of a line to check if it makes a difference.
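A minimal sketch of that check, assuming a stand-in variable (`big` is hypothetical and not taken from the report above): a trailing semicolon evaluates the expression but suppresses display of its result.

```julia
# Hypothetical stand-in for a large scraped result.
big = [string("<div>", i, "</div>") for i in 1:100_000]

# Trailing semicolon: the expression is evaluated, but its value is not displayed.
big;
```

If the lag persists with the semicolon in place, result rendering is unlikely to be the cause.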

@pfitzseb
Member

pfitzseb commented Aug 7, 2019

Fixed with JunoLab/Atom.jl@445e3c9, afaict.

@pfitzseb pfitzseb closed this as completed Aug 7, 2019
@aaowens

aaowens commented Oct 26, 2019

I'm experiencing this exact problem today. After executing any command in the REPL, I need to wait a few seconds before it lets me type anything else in the REPL. One of my variables is a huge XML file. Printing it in the REPL produces 20 seconds of scrolling text. Julia 1.2, Juno 0.7.2.

Closing the workspace pane fixes the problem for me.

@pfitzseb
Member

What does "huge XML file" mean? What package produces that and what does show output?

@aaowens

aaowens commented Oct 26, 2019

I load a large (100 MB) XML file using the LightXML package. Here's a MWE which generates its own data.

using LightXML
# create an empty XML document
xdoc = XMLDocument()

# create & attach a root node
xroot = create_root(xdoc, "States")

# create 2,000,000 child nodes, each containing a text node
for i = 1:2_000_000
    xs1 = new_child(xroot, "State$i")
    add_text(xs1, "$i")
end

If you run this code, the Juno REPL should become very laggy.

@aaowens

aaowens commented Nov 24, 2019

Should this issue be reopened? I still experience this if I run my MWE on the current version of Juno and Julia 1.3 (or 1.2). @pfitzseb

@pfitzseb
Member

pfitzseb commented Nov 25, 2019

What version of Atom.jl are you on? I'm not seeing any issues at all with your MWE (even with 20e6 nodes).

@aaowens

aaowens commented Nov 25, 2019

Atom.jl v0.11.3, Atom itself is 1.41.0. I'm running Linux.

To be more specific, every time I hit enter in the REPL, I have to wait about 2 seconds before I can enter anything in the julia> prompt. If I type many keys during this time, they all appear at once.

Closing the workspace pane or zeroing out the variables fixes the lag.
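For reference, a minimal sketch of what "zeroing out" means here, assuming a plain Julia array as a stand-in (`big` is hypothetical; a LightXML document is backed by memory outside Julia's garbage collector, so it may need to be released separately):

```julia
# Hypothetical stand-in for a large workspace variable.
big = fill("x", 1_000_000)

# Rebind the global so nothing large is left for the workspace pane to render,
# then ask the garbage collector to reclaim the memory.
big = nothing
GC.gc()
```

This only sidesteps the lag; the variable is no longer available afterwards.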
