This repository has been archived by the owner on Jun 16, 2023. It is now read-only.

15 May 08:54

unsleepy22

2.4.0

c905754

Release 2.4.0 Latest

New features

Support exactly-once with async checkpoint via rocksdb and HDFS.
Introduce new window mechanism
1. supports tumbling window and sliding window.
2. supports count window, processing time window, event time window, session window.
3. doesn't hold all data before a window is triggered, computes on data arrival.
Support gray upgrade
1. supports per worker/component gray upgrade
2. supports upgrade rollback
Add memory/rocksdb-based KV store.
Open-source the HBase metrics plugin.
Support multiple metrics uploaders.
Add an API in MetricClient to register topology-level metrics
Support component stream metrics, i.e., stream metrics aggregated in components
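
The window notes above say that data is computed on arrival instead of being held until the window triggers. A minimal sketch of that idea for a tumbling count window follows. The class `TumblingCountSum` and its methods are hypothetical names chosen for illustration and are not part of the JStorm window API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; this is not JStorm's window API.
class TumblingCountSum {
    private final int windowSize;   // number of elements per window
    private long runningSum = 0;    // aggregate updated on every arrival
    private int count = 0;          // elements seen in the current window
    private final List<Long> results = new ArrayList<>();

    TumblingCountSum(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    // Fold the element into the running aggregate; no element is buffered.
    void onElement(long value) {
        runningSum += value;
        count++;
        if (count == windowSize) {
            // The window is full: emit the aggregate and start a new window.
            results.add(runningSum);
            runningSum = 0;
            count = 0;
        }
    }

    List<Long> results() {
        return results;
    }

    public static void main(String[] args) {
        TumblingCountSum window = new TumblingCountSum(3);
        for (long v = 1; v <= 7; v++) {
            window.onElement(v);
        }
        // Prints [6, 15]; element 7 belongs to a window that is still open.
        System.out.println(window.results());
    }
}
```

Only the running sum and a counter are kept per window, so memory use does not grow with the window size.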

Improvements

Support deserialization for no-arg classes in kryo
Add a getValue method to AsmMetric for quick assertions, so that unit tests/integration tests don't have to get
metrics from nimbus

Bug Fix

Fix the bug of incorrect computation of unstopped tasks when assigning topology
Fix the bug that supervisor storm.yaml is always different from nimbus storm.yaml
Fix the bug that kryo doesn't accept conf value of literal string "true"/"false"
Thanks to @gohitbear @bryant1410 @oubenruing for doc fixes.
Thanks to @zeromem @elloray @yunfan123 @iBuddha @Glowdable @waooog for bug fixes.
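
The kryo fix above concerns configuration values given as the literal strings "true"/"false". As a hedged sketch of how such values can be handled in general (this is not the actual JStorm patch, and `ConfBooleans.parse` is a hypothetical helper), a lenient parser might accept both `Boolean` objects and their string forms:

```java
// Hypothetical helper; not the code used in the JStorm fix.
class ConfBooleans {
    // Accepts a Boolean, or the strings "true"/"false" in any letter case.
    static boolean parse(Object value, boolean defaultValue) {
        if (value == null) {
            return defaultValue;
        }
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof String) {
            String s = ((String) value).trim();
            if (s.equalsIgnoreCase("true")) {
                return true;
            }
            if (s.equalsIgnoreCase("false")) {
                return false;
            }
        }
        // Anything else is rejected rather than silently treated as false.
        throw new IllegalArgumentException("Not a boolean conf value: " + value);
    }

    public static void main(String[] args) {
        System.out.println(parse("true", false));       // prints true
        System.out.println(parse(Boolean.FALSE, true)); // prints false
    }
}
```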

09 Jan 01:49

wuchong

2.2.1

66f1c24

Release 2.2.1

New features

Performance is improved by 200%~300% compared to Release 2.1.1 and 0.9.8.1 in several test scenarios, by 120%~200%
compared to Flink, and by 300%~400% compared to Storm.
1. Restructure the batch solution
2. Improve serialization and deserialization to reduce CPU and network cost
3. Reduce CPU cost on the critical path and in metrics
4. Improve the strategy of the netty client and netty server
5. Support consuming from and publishing to the disruptor queue in batch mode
Introduce snapshot exactly-once framework
1. Compared to the Trident solution, the performance of the new framework is several times higher.
2. The new framework also supports "at least once" mode. Compared to the acker mechanism, it reduces the cost
  of the related calculation in the acker and the cost of network, which improves performance significantly.
Support JStorm on yarn
1. A JStorm cluster is now capable of fast deployment and fast scale-in/scale-out, which improves resource utilization.
Redesign the backpressure solution. Flow control is now stage by stage.
1. The solution is now simple and effective. The response is much faster when backpressure is switched on or off.
2. Performance and stability are improved significantly compared to the original solution.
Introduce Window API
1. Support tumbling window and sliding window
2. Windows support two collection modes: count and duration.
3. Support watermark mechanism
Introduce the support of Flux
1. Flux is a programming framework or component that aims to help create and deploy JStorm topologies quickly.
Isolate the dependencies of jstorm and user topology with the maven shade plugin to fix dependency conflicts.
Improve Shuffle grouping solution
1. Integrate shuffle, localOrShuffle and localFirst. The grouping solution adapts automatically to the assignment of the topology.
2. Introduce load awareness in shuffle to balance the load of downstream tasks.
Support configuring a blacklist in Nimbus to exclude problematic nodes
Support batch mode in trident
Supervisors will synchronize cluster configuration from nimbus master automatically
Add buildTs to supervisor info and heartbeats
Add ext module for nimbus and supervisor to support external plugins
Add jstorm-elasticsearch support, thanks to @elloray for your contribution

Improvements

Restructure nimbus metrics implementation. The topology metrics runnable is now event-driven.
Restructure topology master. The processor in TM is now event-driven.
Add some examples to cover more scenarios
Disable stream metrics to reduce the cost of sending metrics to Nimbus
Support metrics in local mode
Improve the gauge implementation: instead of reporting the instantaneous value of each minute, report the average of several sampled values within each minute.
Introduce an approximate histogram calculation to reduce memory usage of histogram metrics
Add Full GC and supervisor network related metrics
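
The gauge change above replaces one instantaneous reading per minute with the average of several samples taken within the minute. A minimal sketch of that averaging, assuming a hypothetical `AveragingGauge` class that is sampled periodically and flushed once per reporting interval (neither name comes from JStorm):

```java
// Hypothetical illustration; not JStorm's gauge implementation.
class AveragingGauge {
    private double sum = 0;   // sum of the samples in the current interval
    private int samples = 0;  // number of samples in the current interval

    // Record one sampled reading.
    synchronized void sample(double value) {
        sum += value;
        samples++;
    }

    // Return the average of the samples since the last flush and reset.
    // An interval without samples reports 0.0 (an assumption of this sketch).
    synchronized double flush() {
        double average = samples == 0 ? 0.0 : sum / samples;
        sum = 0;
        samples = 0;
        return average;
    }

    public static void main(String[] args) {
        AveragingGauge gauge = new AveragingGauge();
        gauge.sample(10);
        gauge.sample(20);
        gauge.sample(60);
        System.out.println(gauge.flush()); // prints 30.0
    }
}
```

Averaging smooths out a spike that happens to coincide with the reporting instant, at the cost of hiding short peaks.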

Bug Fix

Fix message disorder bug
Fix the bug that some connections to zookeeper are not closed as expected when the supervisor encounters an exception.
The deactivate might be called by mistake during task init
The rootId might occasionally be duplicated, which causes unexpected message failures.
Fix a bug in local mode
Fix logwriter's bug
Some task metrics (RecvTps, ProcessLatency) might not be aggregated correctly.
Fix the race condition of AsmCounter during flushing

Misc

Please see the docs at http://jstorm.io for upgrade guides.

09 Mar 01:55

longdafeng

2.1.1

b515985

Publish 2.1.1 to Maven Central repository

For Chinese release notes, please refer to https://github.com/alibaba/jstorm/blob/master/history_cn.md

New features

1.5~6x performance boost from worst to best scenarios compared to JStorm-2.1.0
Add application-level auto-batch
Add independent control channel to separate control msgs from biz msgs to guarantee high priority for control msgs
Dramatic performance boost in metrics, see "Improvements" section
Support jdk1.8
Add Nimbus hook and topology hook
Metrics system:
1. Support disable/enable metrics on the fly
2. Add jstorm metrics design docs, see JSTORM-METRICS.md
JStorm web UI:
1. Add zookeeper viewer in web UI, thanks to @dingjun84
2. Add log search and deep log search, support both backward search and forward search
3. Support log file download
Support changing log level on the fly
Change error structure in zk, add errorLevel, errorCode and duration.
Add supervisor health check
Add -Dexclude.jars option to enable filtering jars manually

Improvements

Metrics:
1. Use JHistogram/JMeter instead of Histogram/Meter, change internal Clock.tick to System.currentTimeMillis to improve performance (50+% boost in Meter and 25+% boost in Histogram)
2. Add TupleLifeCycle metric
3. Add supervisor metrics: total_cpu_usage, total_mem_usage, disk_usage
4. Remove some unnecessary metrics like emitTime, etc.
5. Use HeapByteBuffer instead of List to transmit metric data points, reduce 60+% metrics memory usage
6. Change sample rate from 10% to 5% by default
7. Remove AsmTimer and related code
Log related:
1. Use logback by default instead of log4j, exclude slf4j-log4j12 dependency
2. Use jstorm.log.dir property instead of ${jstorm.home}/logs, see jstorm.logback.xml
3. Change all log4j Loggers to slf4j Loggers
4. Set default log page size (log.page.size) in defaults.yaml to 128KB (web UI)
5. Change topology log structure, add ${topology.name} directory, see jstorm.logback.xml
6. Add timestamp in supervisor/nimbus gc log files; back up worker gc log before launching a new worker
7. Set logback/log4j file encoding to UTF-8
Refine backpressure strategy to avoid over-backpressure
Change acker pending rotating map to single thread to improve performance
Update RefreshConnections to avoid downloading assignments from zk frequently
Change default memory of Supervisor to 1G (previously 512MB)
Use ProcessLauncher to launch processes
Add DefaultUncaughtExceptionHandler for supervisor and nimbus
Change local ports to be different from 0.9.x versions (supervisor.slots.ports.base, nimbus.thrift.port,
nimbus.deamon.logview.port, supervisor.deamon.logview.port)
Change highcharts to echarts to avoid potential license violation
Dependency upgrades:
1. Upgrade kryo to 2.23.0
2. Upgrade disruptor to 3.2.2

Bug fix

Fix deadlock when starting workers
Fix the bug that when localstate file is empty, supervisor can't start
Fix kryo serialization for HeapByteBuffer in metrics
Fix total memory usage calculation
Fix the bug that empty worker is assigned when configured worker number is bigger than the actual number for user defined scheduler
Fix UI log home directory
Fix XSS security bug in web UI
Don't start TopologyMetricsRunnable thread in local mode, thanks to @L-Donne
Fix JSTORM-141, JSTORM-188 that TopologyMetricsRunnable consumes too much CPU
Remove the MaxTenuringThreshold JVM option to support jdk1.8, thanks to @249550148
Fix possible NPE in MkLocalShuffer

Deploy and scripts

Add cleanup for core dumps
Add supervisor health check in healthCheck.sh
Change jstorm.py to terminate the original python process when starting nimbus/supervisor

Upgrade guide

JStorm 2.1.1 is mostly compatible with 2.1.0, but it's better to restart your topologies to finish the upgrade.
If you're using log4j, be aware that the default logging system has switched to logback. If you still want to use log4j, please add "user.defined.log4j.conf: jstorm.log4j.properties" to your conf/storm.yaml.
If you're using slf4j-api + log4j, please add slf4j-log4j12 dependency in your pom config.

12 Nov 10:02

wuchong

2.1.0

9808b31

Release 2.1.0

This version is for Alibaba Global Shopping Festival, November 11th 2015.

New features

Completely redesign the Web UI
1. Make the UI more beautiful
2. Greatly improve Web UI speed.
3. Add Cluster/Topology Level Summarized Metrics for the most recent 30 minutes.
4. Add DAG in the Web UI, support User Interaction to get key information such as emit, tuple lifecycle, tps
Redesign Metrics/Monitor System
1. New metrics core: supports sampling with more metrics, avoids noise, and merges metrics automatically for the user.
2. No metrics will be stored in ZK
3. Support metrics HA
4. Add more useful metrics, such as tuple lifecycle, netty metrics, disk space etc., and get worker memory accurately
5. Support external storage plugin to store metrics.
Implement Smart BackPressure
1. With Smart Backpressure the dataflow is more stable, and noise is prevented from triggering backpressure
2. Backpressure is easy to control manually
Implement TopologyMaster
1. Redesign heartbeat mechanism, easily support 6000+ tasks
2. Collect all tasks' metrics and merge them, relieving pressure on Nimbus.
3. Central Control Coordinator, issue control command
Redesign ZK usage, one set of ZK supports 2000+ hardware nodes.
1. No dynamic data in ZK, such as heartbeat, metrics, monitor status.
2. Nimbus visits ZK less frequently when serving the thrift API.
3. Reduce the frequency of visiting ZK, merge some task-level ZK nodes.
4. Reduce the frequency of visiting ZK, remove useless ZK nodes, such as empty taskerror nodes
5. Tuning ZK cache
6. Optimize ZK reconnect mechanism
Tuning Executor Batch performance
1. Add smart batch size setting
2. Remove memory copy
3. Directly issue tuple without batch for internal channel
4. Set Kryo as the default serialization/deserialization method
Set Kryo as the default serialization/deserialization method to improve performance.
Support dynamic reload binary/configuration
Tuning LocalShuffle performance: set three priority levels (local worker, local node, other node) and add dynamic checks of queue status and connection status.
Optimize Nimbus HA, only the highest priority nimbuses can be promoted as master
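
The LocalShuffle item above describes three priority levels (local worker, local node, other node) combined with checks of queue status and connection status. One way to picture the selection step is sketched below. The `PriorityShuffle` class, its `Target` type and the two boolean flags are assumptions made for illustration and do not reflect JStorm's internal code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of three-level priority target selection; not JStorm's code.
class PriorityShuffle {
    // Declaration order is the priority order: closest targets first.
    enum Locality { LOCAL_WORKER, LOCAL_NODE, OTHER_NODE }

    static final class Target {
        final int taskId;
        final Locality locality;
        final boolean queueHasRoom; // assumed result of the dynamic queue-status check
        final boolean connected;    // assumed result of the connection-status check

        Target(int taskId, Locality locality, boolean queueHasRoom, boolean connected) {
            this.taskId = taskId;
            this.locality = locality;
            this.queueHasRoom = queueHasRoom;
            this.connected = connected;
        }

        boolean usable() {
            return queueHasRoom && connected;
        }
    }

    // Returns the usable task ids of the highest-priority level that has any.
    static List<Integer> candidates(List<Target> targets) {
        for (Locality level : Locality.values()) {
            List<Integer> ids = new ArrayList<>();
            for (Target t : targets) {
                if (t.locality == level && t.usable()) {
                    ids.add(t.taskId);
                }
            }
            if (!ids.isEmpty()) {
                return ids;
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<Target> targets = new ArrayList<>();
        targets.add(new Target(1, Locality.LOCAL_WORKER, false, true)); // queue is full
        targets.add(new Target(2, Locality.LOCAL_NODE, true, true));
        targets.add(new Target(3, Locality.OTHER_NODE, true, true));
        System.out.println(candidates(targets)); // prints [2]
    }
}
```

A caller would then shuffle among the returned candidates; falling through to the next level only when a closer level has no usable target keeps traffic local whenever possible.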

Improvement

Supervisor automatically dumps worker jstack/jmap when a worker's status is invalid.
Supervisor can generate more ports according to memory.
Supervisor can attempt to download the binary more times.
Support setting logdir in configuration
Add configuration "nimbus.host.start.supervisor"
Add supervisor/nimbus/drpc gc log
Adjust jvm parameters
1. Set -Xmn to 1/2 of heap memory
2. Set PermSize to 1/32 and MaxPermSize to 1/16 of heap memory
3. Set -Xms by "worker.memory.min.size"
Refine ZK error schema, when worker is dead, UI will report error
Add function to zktool utility, support remove all topology znodes, support list
Optimize netty client.
Dynamically update connected task status by network connection, not by ZK znode.
Add configuration "topology.enable.metrics".
Classify all topology logs into one directory by topologyName.

Bug fix

Skip downloading the same binary when the assignment has been changed.
Skip starting the worker when the binary is invalid.
Use the correct configuration map in many worker threads
Nimbus checks the topologyName as the first step when a topology is submitted
Support fieldGrouping for Object[]
For drpc single instance under one configuration
In the client topologyNameExists interface, use the thrift api directly
Fix restart failure caused by competition with the topology cleanup thread

Deploy and scripts

Optimize cleandisk.sh to avoid deleting useful worker logs

05 Aug 12:57

longdafeng

2.0.4-SNAPSHOT

e935da9

Merge for Apache Pre-release

Release 2.0.4-SNAPSHOT

New features

Redesign Metric/Monitor system, new RollingWindow/Metrics/NettyMetrics, all data is sent/received through thrift
Redesign Web-UI, the new Web-UI code is clear and clean
Add NimbusCache Layer, using RocksDB and TimeCacheWindow
Refactor all ZK structures and ZK operations
Refactor all thrift structures
Merge the 3 modules jstorm-client/jstorm-client-extension/jstorm-core into jstorm-core
Set the dependency versions to be the same as storm
Sync all java code of apache-storm-0.10.0-beta1
Switch log system to logback
Upgrade thrift to apache thrift 0.9.2
Performance tuning for huge topologies with more than 600 workers or 2000 tasks
Require jdk7 or higher

Release 0.9.7.1

New Features

Batch the tuples whose target task is the same before sending them out (task.batch.tuple=true, task.msg.batch.size=4).
LocalFirst grouping is updated. If all local tasks are busy, the tasks of outside nodes will be chosen as target task instead of waiting on the busy local task.
Support user to reload the application config when topology is running.
Support user to define the task heartbeat timeout and task cleanup timeout for topology.
Update the wait strategy of disruptor queue to non-blocking mode "TimeoutBlockingWaitStrategy"
Support user to define the timeout of discarding messages that are pending for a long time in netty buffer.
Update the message processing structure. The virtualPortDispatch and drainer thread are removed to reduce the unnecessary cost of cpu and the transmitting of tuples
Add jstorm parameter "--include-jars" when submitting a topology, to add these jars to the classpath
Nimbus or Supervisor exits when the local ip is 127.0.0.0
Add user-define-scheduler example
Merge Supervisor's syncSupervisor and syncProcess
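
The first item above batches tuples that share a target task before sending, with a batch size of 4 in the quoted settings. A minimal sketch of such per-target batching follows. `TaskBatcher` and its methods are hypothetical, tuples are modeled as plain strings, and "sending" is modeled by appending the batch to a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of per-target batching; not JStorm's implementation.
class TaskBatcher {
    private final int batchSize;
    private final Map<Integer, List<String>> pending = new HashMap<>();
    private final List<List<String>> sent = new ArrayList<>();

    TaskBatcher(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    // Buffer the tuple under its target task; send the batch once it is full.
    void add(int targetTask, String tuple) {
        List<String> batch = pending.computeIfAbsent(targetTask, k -> new ArrayList<>());
        batch.add(tuple);
        if (batch.size() >= batchSize) {
            sent.add(pending.remove(targetTask));
        }
    }

    // Number of tuples still waiting for the given target task.
    int pendingFor(int targetTask) {
        List<String> batch = pending.get(targetTask);
        return batch == null ? 0 : batch.size();
    }

    List<List<String>> sent() {
        return sent;
    }

    public static void main(String[] args) {
        TaskBatcher batcher = new TaskBatcher(4);
        batcher.add(1, "a");
        batcher.add(1, "b");
        batcher.add(2, "x");
        batcher.add(1, "c");
        batcher.add(1, "d");
        System.out.println(batcher.sent());        // prints [[a, b, c, d]]
        System.out.println(batcher.pendingFor(2)); // prints 1
    }
}
```

A real implementation would also need a timeout-based flush so that a partially filled batch is not held indefinitely; that part is omitted here.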

Bug Fix

Improve the GC setting.
Fix the bug that task heartbeat might not be updated in time in some scenarios.
Fix the bug that the reconnection operation might be stuck for an unexpected period when the connection to a remote worker is shut down and some messages are buffered in netty.
Reuse thrift client when submitting a topology
Avoid repeatedly downloading the binary when a worker fails to start.

Changed setting

Change task's heartbeat timeout to 4 minutes
Set the netty client thread pool(clientScheduleService) size as 5

Deploy and scripts

Improve cleandisk.sh to avoid deleting the current directory and /tmp/hsperfdata_admin
Add executable attribute for the scripts under example
Add parameter to stat.sh, which can be used to start supervisor or not. This is useful under virtual

Release 0.9.7

New Features

Support dynamic scale-out/scale-in of worker, spout, bolt or acker without stopping the service of topology.
When cgroup is enabled, support an upper limit on cpu core usage. Default setting is 3 cpu cores.
Update the mechanism of task heartbeats so that the heartbeat correctly tracks the status of the spout/bolt execute thread.
Support adding jstorm prefix info (clusterName, topologyName, ip:port, componentName, taskId, taskIndex) for worker/task log
Check the heartbeat of supervisor during topology assignment to ensure no worker will be assigned to a dead supervisor
Add api to query the task/worker's metric info, e.g. load status of task queue, worker cpu usage, worker mem usage...
Try to re-download jars when starting a worker fails several times to avoid potential corruption of jars
Add Nimbus ZK cache, accelerate nimbus read zk
Add thrift api getVersion, which is used to check the client jstorm version against the server jstorm version.
Update the metrics' structure to Alimonitor
Add exclude-jar parameter into jstorm.py, which avoids class conflicts when submitting a topology

Bug Fix

Fix the problem that the supervisor process does not respond when a large number of topologies are submitted in a short time
When submitting two or more topologies at the same time, the later one might fail.
TickTuple does not need to be acked. Fix the incorrect count of failure message.
Fix the potential incorrect assignment when use.old.assignment=true
Fix the failure to remove some zk nodes when killing a topology
Fix the failure to restart a topology while nimbus is doing an assignment job.
Fix NPE when registering metrics
Fix the failure to read ZK monitor znode through zktool
Fix exception when classload is enabled in local mode
Fix duplicate log when user-defined logback is enabled in local mode

Changed Setting

Set Nimbus jvm memory size as 4G
Change the timeout of the heartbeat from supervisor to nimbus from 60s to 180s
In order to avoid OOM, set storm.messaging.netty.max.pending as 4
Set task queue size as 1024, worker's total send/receive queue size as 2048

Deploy and scripts

Add rpm build spec
Add deploy files of jstorm for rpm package building
Enable the cleandisk cronjob every hour, reserve coredump for only one hour.

16 Feb 10:09

longdafeng

0.9.6.3

e6884c6

0.9.6.3

New features

Implement tick tuple
Support logback
Support to load the user defined configuration file of log4j
Enable the display of user defined metrics in web UI
Add "topologyName" parameter for "jstorm list" command
Support the use of ip and hostname at the same time for user-defined scheduling
Support junit test for local mode
Enable client command(e.g. jstorm jar) to load self-defined storm.yaml

Bug fix

Add activate and deactivate api of spout, which are used in nextTuple prepare phase
Update the support of multi language
Check the worker's heartbeat asynchronously to speed up the launch of workers
Add a check of the worker's pid to speed up the detection of dead workers
Fix the high cpu load of disruptor producer when disruptor queue is full
Remove the confusing exception reported by disruptor queue when killing worker
Fix the failure problem of "jstorm restart" client command
Report error when user submits a jar built on an incompatible jstorm release
Fix the problem that one log is printed twice when the user defines a log4j or logback configuration in local mode
Fix the potential exception when killing topology in local mode
Forbid user to change the log level of jstorm log
Add a configuration template of logback
Fix the problem that an uploaded lib jar is processed as the application jar
Make sure the ZK nodes of a removed topology are cleaned up
Add the topology name to the information produced on java core dump
Fix the incorrect value of -XX:MaxTenuringThreshold. Currently, the default value of jstorm is 20, but the max value in JDK8 is 15.
Fix the potential reading failure of cpu core number, which may cause the supervisor slot to be set to 0
Fix the "Address family not supported by protocol family" error in local mode
Do not start logview http server in local mode
Add the creation of log dir in the supervisor alive checking script
Check the correctness of ip specified in configuration file before starting nimbus
Check the correctness of env variable $JAVA_HOME/$JSTORM_HOME/$JSTORM_CONF_DIR before starting jstorm service
Specify the log dir for rpm installation
Add reading permission of /home/admin/jstorm and /home/admin/logs for all users after rpm installation
Configure local temporary ports during rpm installation
Add noarch rpm package

01 Dec 07:23

longdafeng

0.9.6.2

9fc5566

0.9.6.2

Add option to switch between BlockingQueue and Disruptor
Fix the bug that, under sync netty mode, the client fails to send messages to the server
Fix the bug so that the web UI can display a 0.9.6.1 cluster
Fix the bug that a topology can be submitted without a main jar but with many small jars
Fix a bug in the restart command
Fix a trident bug
Add the validation of topology name, component name... Only A-Z, a-z, 0-9, '_', '-', '.' are valid now.
Fix a bug in closing the thrift client
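
The name validation rule above (only A-Z, a-z, 0-9, '_', '-', '.') can be expressed as a regular expression. The sketch below is an illustration of that rule and is not the validation code shipped in JStorm. `NameValidator` is a hypothetical class, and rejecting empty names is an assumption of this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the character rule; not JStorm's validation code.
class NameValidator {
    // One or more of: A-Z, a-z, 0-9, '_', '.', '-'.
    // The hyphen is placed last in the character class so it is read literally.
    private static final Pattern VALID = Pattern.compile("[A-Za-z0-9_.-]+");

    static boolean isValid(String name) {
        // Null and empty names are rejected (an assumption of this sketch).
        return name != null && VALID.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("word-count_v1.2")); // prints true
        System.out.println(isValid("word count"));      // prints false
    }
}
```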

17 Nov 02:42

longdafeng

0.9.6.2-rc

9fc5566

0.9.6.2-rc Pre-release

Improve user experience from Web UI
1. Add jstack link
2. Add worker log link in supervisor page
3. Add Web UI log encode setting "gbk" or "utf-8"
4. Show starting tasks in component page
5. Show dead task's information in UI
6. Fix the bug that error info can not be displayed in UI when task is restarting
Add restart command; with this command, user can reload configuration and reset worker/task parallelism
Upgrade curator/disruptor/guava version
Revert json lib to google-simple json, wrap all json operations into two utility methods
Add new storm submit api, supporting submitting a topology from java
Enable launching processes in the background
Set "spout.pending.full.sleep" default value as true
Fix the bug that the user-defined scheduler does not support a list of workers
Add disruptor/JStormUtils junit test
Enable user to configure the monitor name of alimonitor
Add tcp option "reuseAddress" in netty framework
Fix the bug: When spout does not implement the ICommitterTrident interface, MasterCoordinatorSpout will stick on commit phase.

11 Oct 12:54

longdafeng

0.9.6.1

4fa7185

0.9.6.1

Add management of multiclusters in web UI.
Merge trident part from storm-0.9.3
Use fastjson instead of gson
Reorganize the code that generates metrics json
Get jstorm version from $JSTORM_HOME/RELEASE instead of hardcode
Change task deserialize thread's SingleThreadDisruptorQueue to MultiThreadDisruptorQueue
Fix the web ui displaying the wrong number of workers in Supervisor page
Fix the task heartbeat thread's competition in accessing the task map
Fix null pointer exception when killing worker and reading worker's heartbeat object
Netty client connect to server only in NettyClient module.
Add break loop operation when netty client connection is closed
Fix the bug that topology warning flag present in cluster page is not consistent with error information present in topology page
Add recovery function when the data of task error information is corrupted
Fix the bug that the metric data cannot be uploaded to Alimonitor when upgrading from pre-0.9.6 to 0.9.6 and executing pkill java without restarting the topology
Fix the bug that zeroMq failed to receive data
Add interface to easily set worker's memory
Set default value of topology.alimonitor.metrics.post to false
Only start NETTY_SERVER_DECODE_TIME for netty server
Keep compatible with Storm for local mode
Print rootId when tuple failed
In order to keep compatible with Storm, add submitTopologyWithProgressBar interface
Upgrade netty version from 3.2.7 to 3.9.0
Support assigning a topology to user-defined supervisors

23 Sep 03:18

longdafeng

0.9.6

4e295a0

0.9.6 release

Update UI
- Display the metrics information of task and worker
- Show a warning flag when errors occur for a topology
- Add link from supervisor page to task page
Send metrics data to Alimonitor
Add metrics interface for user
Add task.cleanup.timeout.sec setting to let tasks clean up gracefully
Set the worker's log name as topologyName-worker-port.log
Add setting "worker.redirect.output.file", so worker can redirect System.out/System.err to a configured file
Add storm list command
Add closing channel check in netty client to avoid double close
Add connecting check in netty client to avoid connecting one server twice at one time


Releases: alibaba/jstorm