Setting context="CpG" in methRead of bismark cytosine report failed to retain only CpG positions #136
You have to filter the file by context before feeding it to methylKit.
Command-line tools such as awk can do it, or you can check the Bismark
arguments to output only a certain context.
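A minimal sketch of the awk filter suggested above, using the ten-line example report from this thread. It assumes the standard Bismark cytosine report layout, where column 6 holds the context (CG, CHG or CHH). The output file name test1.CpG_report.txt is only an illustration.

```shell
# Recreate the example report from this thread (tab-separated, 7 columns).
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr1 17365 + 0 0 CHH CTA \
  chr1 17368 + 0 0 CHH CCT \
  chr1 17369 + 0 0 CHH CTA \
  chr1 17372 + 0 0 CHG CAG \
  chr1 17374 - 1 11 CHG CTG \
  chr1 17376 - 0 13 CHH CTC \
  chr1 17377 - 0 13 CHH CCT \
  chr1 17378 + 0 0 CG CGA \
  chr1 17379 - 12 2 CG CGC \
  chr1 17381 + 0 0 CHH CAT \
  > test1.CX_report.txt

# Keep only rows whose context column (field 6) is exactly "CG".
awk -F'\t' '$6 == "CG"' test1.CX_report.txt > test1.CpG_report.txt

# Show the filtered report: only the two CG rows remain.
cat test1.CpG_report.txt
```

The filtered file can then be passed to methRead in place of the full CX report. With mincov=1 the zero-coverage CG row at 17378 should also be dropped, leaving only position 17379. As far as I know, Bismark's cytosine report covers only CpG context unless --CX is given, so omitting that flag may avoid the filtering step entirely.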
On Wed 31. Oct 2018 at 16:27, Xuning Wang ***@***.***> wrote:
Here is a test cytosine report from Bismark:
cat test1.CX_report.txt (tab-separated)
chr1 17365 + 0 0 CHH CTA
chr1 17368 + 0 0 CHH CCT
chr1 17369 + 0 0 CHH CTA
chr1 17372 + 0 0 CHG CAG
chr1 17374 - 1 11 CHG CTG
chr1 17376 - 0 13 CHH CTC
chr1 17377 - 0 13 CHH CCT
chr1 17378 + 0 0 CG CGA
chr1 17379 - 12 2 CG CGC
chr1 17381 + 0 0 CHH CAT
This was the command I used to read the data:
file="test1.CX_report.txt"
methRaw <- methRead(list(file), sample.id=list("test"),
assembly="hg19",
header = FALSE, context="CpG",
pipeline="bismarkCytosineReport",
treatment=c(0),
mincov =1
)
This was supposed to keep only "chr1 17379 - 12 2 CG CGC", but I got all
positions with min coverage of 1.
methRaw
methylRawList object with 1 methylRaw objects
methylRaw object with 4 rows
chr start end strand coverage numCs numTs
1 chr1 17374 17374 - 12 1 11
2 chr1 17376 17376 - 13 0 13
3 chr1 17377 17377 - 13 0 13
4 chr1 17379 17379 - 14 12 2
sample.id: test
assembly: hg19
context: CpG
resolution: base
treatment: 0
I tried changing CpG to CG or CHH in context=, results are the same. I am
wondering whether there is something wrong with the methRead method, when
reading bismark cytosine report?
packageVersion("methylKit")
[1] ‘1.7.5’
Thanks.
It should be quite easy to add a context filter to .procBismarkCoverage and .procBismarkCytosineReport.
Yes, but we would need to read the whole file into memory first, and those
files can be quite large. This is better done via awk. If there is a way to
filter on the fly while reading with data.table, then we could add it.
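To illustrate the memory point, here is a hedged sketch of a streaming filter: awk handles one line at a time, so the report never has to be held in memory, and a gzipped report can be decompressed on the fly. The file names and the mincov variable are hypothetical. Coverage is taken as the sum of columns 4 and 5 (methylated plus unmethylated counts), which matches the coverage values shown in the output earlier in this thread.

```shell
# Build a small gzipped stand-in for a large report (hypothetical file name).
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr1 17374 - 1 11 CHG CTG \
  chr1 17378 + 0 0 CG CGA \
  chr1 17379 - 12 2 CG CGC \
  | gzip -c > big.CX_report.txt.gz

# Stream the file: decompress, keep CG rows with coverage >= mincov,
# and write the result as plain text. Only one line is in memory at a time.
gzip -dc big.CX_report.txt.gz \
  | awk -F'\t' -v mincov=1 '$6 == "CG" && ($4 + $5) >= mincov' \
  > big.CpG_report.txt

cat big.CpG_report.txt
```

Filtering on coverage here is optional, since methRead applies its own mincov. It is included only to show that both conditions can be checked in the same pass.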
…On Fri, Dec 7, 2018 at 9:52 PM Alexander Gosdschan ***@***.***> wrote:
should be quite easy to add a context filter to .procBismarkCoverage and
.procBismarkCytosineReport .
We will close this issue for now, as fread does not have a filtering function at this point.