change load node and edge from local to cpu #83

Merged
merged 2 commits into from
Aug 16, 2022

Conversation

miaoli06

@huwei02
Collaborator

huwei02 commented Aug 16, 2022

lgtm

@jiaoxuewu jiaoxuewu merged commit c946cf3 into xuewujiao:gpugraph Aug 16, 2022
xuewujiao added a commit that referenced this pull request Aug 18, 2022
* change load node and edge from local to cpu (#83)

* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>

* extract pull sparse as single stage(#85)

Co-authored-by: yangjunchao <[email protected]>

Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>
Thunderbrook added a commit that referenced this pull request Aug 22, 2022
* change load node and edge from local to cpu (#83)

* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>

* extract pull sparse as single stage(#85)

Co-authored-by: yangjunchao <[email protected]>

* support ssdsparsetable;test=develop (#81)

* graph sample v2

* remove log

Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>
Co-authored-by: danleifeng <[email protected]>
lxsbupt added a commit that referenced this pull request Nov 8, 2022
* Optimizing the zero key problem in the push phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Performance optimization, segment gradient merging

* Performance optimization, segment gradient merging

* Optimize pullsparse and increase keys aggregation

* sync gpugraph to gpugraph_v2 (#86)

* change load node and edge from local to cpu (#83)

* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>

* extract pull sparse as single stage(#85)

Co-authored-by: yangjunchao <[email protected]>

Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>

* [GPUGraph] graph sample v2 (#87)

* change load node and edge from local to cpu (#83)

* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>

* extract pull sparse as single stage(#85)

Co-authored-by: yangjunchao <[email protected]>

* support ssdsparsetable;test=develop (#81)

* graph sample v2

* remove log

Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>
Co-authored-by: danleifeng <[email protected]>

* Release cpu graph

* uniq nodeid (#89)

* compatible whole HBM mode (#91)

Co-authored-by: yangjunchao <[email protected]>

* Gpugraph v2 (#93)

* compatible whole HBM mode

* unify flag for graph emd storage mode and graph struct storage mode

* format

Co-authored-by: yangjunchao <[email protected]>

* split generate batch into multi stage (#92)

* split generate batch into multi stage

* fix conflict

Co-authored-by: root <[email protected]>

* [GpuGraph] Uniq feature (#95)

* uniq feature

* uniq feature

* uniq feature

* [GpuGraph]  global startid (#98)

* uniq feature

* uniq feature

* uniq feature

* global startid

* load node edge seperately and release graph (#99)

* load node edge seperately and release graph

* load node edge seperately and release graph

Co-authored-by: root <[email protected]>

* v2 infer (#102)

* optimize begin pass and end pass (#106)

Co-authored-by: yangjunchao <[email protected]>

* fix ins no (#104)

* [GPUGraph] fix FillOneStep args (#107)

* fix ins no

* fix FillOnestep args

* fix bug for whole hbm mode (#110)

Co-authored-by: yangjunchao <[email protected]>

* [GPUGraph] fix infer && add infer_table_cap (#108)

* fix ins no

* fix FillOnestep args

* fix infer && add infer table cap

* fix infer

* 【PSCORE】perform ssd sparse table  (#111)

* perform ssd sparsetable;test=develop

Conflicts:
	paddle/fluid/framework/fleet/ps_gpu_wrapper.cc

* perform ssd sparsetable;test=develop

* remove debug code;

* remove debug code;

* add jemalloc cmake;test=develop

* fix wrapper;test=develop

* fix sample core (#114)

* [GpuGraph] optimize shuffle batch (#115)

* fix sample core

* optimize shuffle batch

* release gpu mem when sample end (#116)

Co-authored-by: root <[email protected]>

* fix class not found err (#118)

Co-authored-by: root <[email protected]>

* optimize sample (#117)

* optimize sample

* optimize sample

Co-authored-by: yangjunchao <[email protected]>

* fix clear gpu mem (#119)

Co-authored-by: root <[email protected]>

* fix sample core (#121)

Co-authored-by: yangjunchao <[email protected]>

* add ssd cache (#123)

* add ssd cache;test=develop

* add ssd cache;test=develop

* add ssd cache;test=develop

* add multi epoch train & fix train table change ins & save infer embeding  (#129)

* add multi epoch train & fix train table change ins & save infer embedding

* change epoch finish judge

* change epoch finish change

Co-authored-by: root <[email protected]>

* Add debug log (#131)

* Add debug log

* Add debug log

Co-authored-by: root <[email protected]>

* optimize mem in  uniq slot feature (#130)

* [GpuGraph] cherry pick var slot feature && fix load multi path node (#136)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

Co-authored-by: huwei02 <[email protected]>

* [GpuGraph] fix kernel overflow (#138)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

* fix kernel overflow && add max feature num flag

Co-authored-by: huwei02 <[email protected]>

* fix ssd cache;test=develop (#139)

* slot feature secondary storage (#140)

* slot feature secondary storage

* slot feature secondary storage

Co-authored-by: yangjunchao <[email protected]>

Co-authored-by: root <[email protected]>
Co-authored-by: xuewujiao <[email protected]>
Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>
Co-authored-by: Thunderbrook <[email protected]>
Co-authored-by: danleifeng <[email protected]>
Co-authored-by: huwei02 <[email protected]>
lxsbupt pushed a commit to lxsbupt/Paddle that referenced this pull request Nov 29, 2022
* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>
lxsbupt added a commit to lxsbupt/Paddle that referenced this pull request Nov 29, 2022
* Optimizing the zero key problem in the push phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Performance optimization, segment gradient merging

* Performance optimization, segment gradient merging

* Optimize pullsparse and increase keys aggregation

* sync gpugraph to gpugraph_v2 (xuewujiao#86)

* change load node and edge from local to cpu (xuewujiao#83)

* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>

* extract pull sparse as single stage(xuewujiao#85)

Co-authored-by: yangjunchao <[email protected]>

Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>

* [GPUGraph] graph sample v2 (xuewujiao#87)

* change load node and edge from local to cpu (xuewujiao#83)

* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>

* extract pull sparse as single stage(xuewujiao#85)

Co-authored-by: yangjunchao <[email protected]>

* support ssdsparsetable;test=develop (xuewujiao#81)

* graph sample v2

* remove log

Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>
Co-authored-by: danleifeng <[email protected]>

* Release cpu graph

* uniq nodeid (xuewujiao#89)

* compatible whole HBM mode (xuewujiao#91)

Co-authored-by: yangjunchao <[email protected]>

* Gpugraph v2 (xuewujiao#93)

* compatible whole HBM mode

* unify flag for graph emd storage mode and graph struct storage mode

* format

Co-authored-by: yangjunchao <[email protected]>

* split generate batch into multi stage (xuewujiao#92)

* split generate batch into multi stage

* fix conflict

Co-authored-by: root <[email protected]>

* [GpuGraph] Uniq feature (xuewujiao#95)

* uniq feature

* uniq feature

* uniq feature

* [GpuGraph]  global startid (xuewujiao#98)

* uniq feature

* uniq feature

* uniq feature

* global startid

* load node edge seperately and release graph (xuewujiao#99)

* load node edge seperately and release graph

* load node edge seperately and release graph

Co-authored-by: root <[email protected]>

* v2 infer (xuewujiao#102)

* optimize begin pass and end pass (xuewujiao#106)

Co-authored-by: yangjunchao <[email protected]>

* fix ins no (xuewujiao#104)

* [GPUGraph] fix FillOneStep args (xuewujiao#107)

* fix ins no

* fix FillOnestep args

* fix bug for whole hbm mode (xuewujiao#110)

Co-authored-by: yangjunchao <[email protected]>

* [GPUGraph] fix infer && add infer_table_cap (xuewujiao#108)

* fix ins no

* fix FillOnestep args

* fix infer && add infer table cap

* fix infer

* 【PSCORE】perform ssd sparse table  (xuewujiao#111)

* perform ssd sparsetable;test=develop

Conflicts:
	paddle/fluid/framework/fleet/ps_gpu_wrapper.cc

* perform ssd sparsetable;test=develop

* remove debug code;

* remove debug code;

* add jemalloc cmake;test=develop

* fix wrapper;test=develop

* fix sample core (xuewujiao#114)

* [GpuGraph] optimize shuffle batch (xuewujiao#115)

* fix sample core

* optimize shuffle batch

* release gpu mem when sample end (xuewujiao#116)

Co-authored-by: root <[email protected]>

* fix class not found err (xuewujiao#118)

Co-authored-by: root <[email protected]>

* optimize sample (xuewujiao#117)

* optimize sample

* optimize sample

Co-authored-by: yangjunchao <[email protected]>

* fix clear gpu mem (xuewujiao#119)

Co-authored-by: root <[email protected]>

* fix sample core (xuewujiao#121)

Co-authored-by: yangjunchao <[email protected]>

* add ssd cache (xuewujiao#123)

* add ssd cache;test=develop

* add ssd cache;test=develop

* add ssd cache;test=develop

* add multi epoch train & fix train table change ins & save infer embeding  (xuewujiao#129)

* add multi epoch train & fix train table change ins & save infer embedding

* change epoch finish judge

* change epoch finish change

Co-authored-by: root <[email protected]>

* Add debug log (xuewujiao#131)

* Add debug log

* Add debug log

Co-authored-by: root <[email protected]>

* optimize mem in  uniq slot feature (xuewujiao#130)

* [GpuGraph] cherry pick var slot feature && fix load multi path node (xuewujiao#136)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

Co-authored-by: huwei02 <[email protected]>

* [GpuGraph] fix kernel overflow (xuewujiao#138)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

* fix kernel overflow && add max feature num flag

Co-authored-by: huwei02 <[email protected]>

* fix ssd cache;test=develop (xuewujiao#139)

* slot feature secondary storage (xuewujiao#140)

* slot feature secondary storage

* slot feature secondary storage

Co-authored-by: yangjunchao <[email protected]>

Co-authored-by: root <[email protected]>
Co-authored-by: xuewujiao <[email protected]>
Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>
Co-authored-by: Thunderbrook <[email protected]>
Co-authored-by: danleifeng <[email protected]>
Co-authored-by: huwei02 <[email protected]>
lxsbupt pushed a commit to lxsbupt/Paddle that referenced this pull request Dec 17, 2022
* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>
lxsbupt added a commit to lxsbupt/Paddle that referenced this pull request Dec 17, 2022
* Optimizing the zero key problem in the push phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Performance optimization, segment gradient merging

* Performance optimization, segment gradient merging

* Optimize pullsparse and increase keys aggregation

* sync gpugraph to gpugraph_v2 (xuewujiao#86)

* change load node and edge from local to cpu (xuewujiao#83)

* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>

* extract pull sparse as single stage(xuewujiao#85)

Co-authored-by: yangjunchao <[email protected]>

Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>

* [GPUGraph] graph sample v2 (xuewujiao#87)

* change load node and edge from local to cpu (xuewujiao#83)

* change load node and edge

* remove useless code

Co-authored-by: root <[email protected]>

* extract pull sparse as single stage(xuewujiao#85)

Co-authored-by: yangjunchao <[email protected]>

* support ssdsparsetable;test=develop (xuewujiao#81)

* graph sample v2

* remove log

Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>
Co-authored-by: danleifeng <[email protected]>

* Release cpu graph

* uniq nodeid (xuewujiao#89)

* compatible whole HBM mode (xuewujiao#91)

Co-authored-by: yangjunchao <[email protected]>

* Gpugraph v2 (xuewujiao#93)

* compatible whole HBM mode

* unify flag for graph emd storage mode and graph struct storage mode

* format

Co-authored-by: yangjunchao <[email protected]>

* split generate batch into multi stage (xuewujiao#92)

* split generate batch into multi stage

* fix conflict

Co-authored-by: root <[email protected]>

* [GpuGraph] Uniq feature (xuewujiao#95)

* uniq feature

* uniq feature

* uniq feature

* [GpuGraph]  global startid (xuewujiao#98)

* uniq feature

* uniq feature

* uniq feature

* global startid

* load node edge seperately and release graph (xuewujiao#99)

* load node edge seperately and release graph

* load node edge seperately and release graph

Co-authored-by: root <[email protected]>

* v2 infer (xuewujiao#102)

* optimize begin pass and end pass (xuewujiao#106)

Co-authored-by: yangjunchao <[email protected]>

* fix ins no (xuewujiao#104)

* [GPUGraph] fix FillOneStep args (xuewujiao#107)

* fix ins no

* fix FillOnestep args

* fix bug for whole hbm mode (xuewujiao#110)

Co-authored-by: yangjunchao <[email protected]>

* [GPUGraph] fix infer && add infer_table_cap (xuewujiao#108)

* fix ins no

* fix FillOnestep args

* fix infer && add infer table cap

* fix infer

* 【PSCORE】perform ssd sparse table  (xuewujiao#111)

* perform ssd sparsetable;test=develop

Conflicts:
	paddle/fluid/framework/fleet/ps_gpu_wrapper.cc

* perform ssd sparsetable;test=develop

* remove debug code;

* remove debug code;

* add jemalloc cmake;test=develop

* fix wrapper;test=develop

* fix sample core (xuewujiao#114)

* [GpuGraph] optimize shuffle batch (xuewujiao#115)

* fix sample core

* optimize shuffle batch

* release gpu mem when sample end (xuewujiao#116)

Co-authored-by: root <[email protected]>

* fix class not found err (xuewujiao#118)

Co-authored-by: root <[email protected]>

* optimize sample (xuewujiao#117)

* optimize sample

* optimize sample

Co-authored-by: yangjunchao <[email protected]>

* fix clear gpu mem (xuewujiao#119)

Co-authored-by: root <[email protected]>

* fix sample core (xuewujiao#121)

Co-authored-by: yangjunchao <[email protected]>

* add ssd cache (xuewujiao#123)

* add ssd cache;test=develop

* add ssd cache;test=develop

* add ssd cache;test=develop

* add multi epoch train & fix train table change ins & save infer embeding  (xuewujiao#129)

* add multi epoch train & fix train table change ins & save infer embedding

* change epoch finish judge

* change epoch finish change

Co-authored-by: root <[email protected]>

* Add debug log (xuewujiao#131)

* Add debug log

* Add debug log

Co-authored-by: root <[email protected]>

* optimize mem in  uniq slot feature (xuewujiao#130)

* [GpuGraph] cherry pick var slot feature && fix load multi path node (xuewujiao#136)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

Co-authored-by: huwei02 <[email protected]>

* [GpuGraph] fix kernel overflow (xuewujiao#138)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

* fix kernel overflow && add max feature num flag

Co-authored-by: huwei02 <[email protected]>

* fix ssd cache;test=develop (xuewujiao#139)

* slot feature secondary storage (xuewujiao#140)

* slot feature secondary storage

* slot feature secondary storage

Co-authored-by: yangjunchao <[email protected]>

Co-authored-by: root <[email protected]>
Co-authored-by: xuewujiao <[email protected]>
Co-authored-by: miaoli06 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: chao9527 <[email protected]>
Co-authored-by: yangjunchao <[email protected]>
Co-authored-by: Thunderbrook <[email protected]>
Co-authored-by: danleifeng <[email protected]>
Co-authored-by: huwei02 <[email protected]>
@miaoli06 miaoli06 deleted the gpugraph0813 branch March 14, 2023 07:28