hayidahubei's personal blog http://blog.sciencenet.cn/u/hayidahubei


0_based and 1_based (SAM file, BAM file and BED file)

Read 2416 times | 2019-4-11 00:27 | Personal category: Genome annotation information | System category: Research notes | Tags: genome annotation, 0_based and 1_based

0_based numbering: the initial element of a sequence is assigned the index 0;

1_based numbering: the initial element of a sequence is assigned the index 1.

 

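BAM and BED coordinates are 0_based, while SAM coordinates are 1_based. More precisely, a BED interval has a 0_based start and an exclusive end (half-open), the binary BAM record stores the alignment position 0_based, and the text SAM format reports the same position 1_based.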
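To make the two numbering schemes concrete, here is a minimal Python sketch. The short sequence and the helper names `base_at_0_based` and `base_at_1_based` are made up for illustration only; they are not part of any SAM/BAM/BED tool.

```python
seq = "ACGT"  # toy sequence, for illustration only


def base_at_0_based(sequence, index):
    """Return the base at a 0_based index (the first base has index 0)."""
    return sequence[index]


def base_at_1_based(sequence, position):
    """Return the base at a 1_based position (the first base has position 1)."""
    return sequence[position - 1]


# Index 0 (0_based) and position 1 (1_based) name the same first base.
first_0 = base_at_0_based(seq, 0)
first_1 = base_at_1_based(seq, 1)
print(first_0, first_1)
```

Python strings are themselves 0_based, so the 1_based helper only has to subtract one before indexing.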

An example is shown below:

The screenshot is from a SAM file. The read (10.4|823|29663|59.364374575|+|61.9047619048|73.3|1) was mapped to mm10, starting at position 3259346 on chr10. Because a SAM file is 1_based, position 3259346 on chr10, counted from 1, is the first nucleotide “A” of the sequence.

samFile.png

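However, after I converted this alignment to a BED file using bamToBed, the start of the region changed from 3259346 to 3259345. Because a BED file is 0_based, the start 3259345 still points to the same base. If the BED interval is read as if it were 1_based, the first nucleotide “A” is the second position of the interval, not the first.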

bedFile.png
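The shift from 3259346 to 3259345 can be reproduced with a little arithmetic. The sketch below is not how bamToBed is implemented; `sam_to_bed_interval` is a hypothetical helper, and it assumes the read aligns to the reference without indels or clipping, so that the aligned reference length equals the read length.

```python
def sam_to_bed_interval(pos_1_based, ref_length):
    """Convert a 1_based SAM start position and an aligned reference length
    into a 0_based, half-open BED interval (start, end)."""
    start = pos_1_based - 1    # 1_based start -> 0_based start
    end = start + ref_length   # the end is exclusive, so no extra shift
    return start, end


read = "AAGGGGCTGGACTTGCATGCCATGGAT"

# Assumption: no indels or clipping, so reference length == read length.
bed_start, bed_end = sam_to_bed_interval(3259346, len(read))
print("chr10", bed_start, bed_end)
```

Under that assumption the result is the interval chr10 3259345 3259372, and `bed_end - bed_start` equals the read length of 27.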

To double-check, I extracted the genomic sequence of the region given in the BED file (chr10:3259345-3259372), as shown below. Comparing it with the read sequence (AAGGGGCTGGACTTGCATGCCATGGAT) in the SAM file confirms that the read indeed starts at the second position of the extracted region.

sequenceInGenome.png
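The same off-by-one effect can be shown on a toy reference. In the sketch below the flanking bases are invented and are not real mm10 sequence, and it assumes the extraction treated the BED numbers as 1_based, inclusive coordinates; the post does not say which tool was used for the extraction.

```python
read = "AAGGGGCTGGACTTGCATGCCATGGAT"

# Hypothetical toy reference: the flanks are made up, NOT real mm10 sequence.
left_flank = "CCTTC"
right_flank = "GGTCA"
reference = left_flank + read + right_flank

# BED-style interval of the read on the toy reference (0_based, end exclusive).
bed_start = len(left_flank)
bed_end = bed_start + len(read)

# Reading the interval as BED maps directly onto a Python slice.
as_bed = reference[bed_start:bed_end]

# Reading the same two numbers as 1_based, inclusive coordinates instead
# pulls in one extra base on the left.
as_one_based = reference[bed_start - 1:bed_end]

# 0_based offset of the read inside the misread region.
offset = as_one_based.find(read)
print(as_bed == read, len(as_one_based) - len(read), offset)
```

Here `offset` is 1, meaning the read begins at the second base of the region when BED numbers are interpreted as 1_based.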

 




https://blog.sciencenet.cn/blog-1113671-1172571.html
