Most Popular

1500 questions
4
votes
1 answer

How to use variables while calling pybedtools

I am trying to use bedtools to fetch sequences from a whole-genome FASTA file. Inside the script, I get the region coordinates as variables, for example: chr_ = "chr1", start = 3000, end = 3402. I am not sure how to wrap my function, because the example…
lizaveta
  • 203
  • 1
  • 3
4
votes
1 answer

Problems in creating desired phylogenetic tree with ggtree

I am working on haplotype data and want to make a tree out of haplogroups using ggtree. I have the following data in Newick…
4
votes
1 answer

Can I run STAR without an annotation file?

I wish to use Rascaf to scaffold a fragmented draft genome. For this, I need to provide a BAM file of aligned RNA-seq reads and the draft genome. So, I indexed the draft genome with STAR like this: STAR --runMode genomeGenerate --genomeDir…
Biomagician
  • 2,459
  • 16
  • 30
4
votes
1 answer

Raw vs Filtered in the output of cellranger count

After running cellranger count, I got two folders relevant for further analysis: filtered_gene_bc_matrices and raw_gene_bc_matrices. What is the difference between them? What does cellranger filter out, and what should we preferably use for further…
Nikita Vlasenko
  • 2,558
  • 3
  • 26
  • 38
4
votes
1 answer

Why isn't there a standard method to convert GFA to JSON?

I'm confused why the GFA format doesn't have a standard JSON parser. It appears to be related to a decision made several years ago, here. The objection appears to be "JSON isn't necessary, as it's just a simple linked list. Users could write their…
EB2127
  • 1,413
  • 2
  • 10
  • 23
4
votes
2 answers

How to compare the contents of a column of the same data table

I have this table: and I want to get the rows that are equal to the first three columns, like this: I've tried these functions, but when I get the index of the lines, R doesn't give the output that I want: df$obj<-sapply(c("sample1", "sample2",…
Sofia
  • 351
  • 2
  • 7
4
votes
1 answer

Transform an R data.frame into a krona plot without krona tools?

KronaTools lacks an R pipeline to convert any data.frame into a krona plot. There is a way to go via a phyloseq-class object, but coming from a data.frame I did not find any documentation on how to transform these.
user5878028
4
votes
2 answers

Protein sequence from patient data

Currently, I am working on NGS data, and my aim is to predict the significance of the variants present in the VCF file. Since the SIFT score is used for significance prediction, I am trying to understand how this score works. When I read its…
Lot_to_learn
  • 530
  • 3
  • 14
4
votes
1 answer

Compute copy number from cases and controls

I have some data on Copy Number Variation (SNP chip) for a population of samples. In particular, I have a set of samples (considered as cases) which display a specific disease phenotype, and another set (considered as controls) which do not. The…
gc5
  • 1,783
  • 18
  • 32
4
votes
2 answers

What is cellranger doing in comparison to other methods?

I've recently started working with the 10X Genomics platform with Illumina (MiSeq and HiSeq) for single-cell RNA-Seq. I've been recommended "cellranger" (version 2.1.0), which I understand handles the barcoding of the platform and performs…
Tom Kelly
  • 873
  • 7
  • 20
4
votes
3 answers

Expression of a gene in different groups

I would like to check the expression of a gene in different groups, such as Disease vs Normal samples. I want to make a plot out of that to check whether the difference is significant or not. From this paper (lncRNA), I see that in Figure 1C they used RPKM values. But,…
beginner
  • 631
  • 7
  • 15
4
votes
3 answers

Is Spark widely used in bioinformatics?

I learned that GATK 4 is using Spark for parallelization. I googled around, though I am still not quite sure how Spark really works and how to use it in practice. Besides GATK 4, are any other bioinformatics tools using Spark? Generally, is Spark…
medbe
  • 847
  • 1
  • 7
  • 9
4
votes
1 answer

Output of Seurat FindAllMarkers parameters

I compared two manually defined clusters using the Seurat package function FindAllMarkers and got the output: Now, I am confused about three things: What are pct.1 and pct.2? How come the adjusted p-values equal 1? What does it mean? If we take first…
Nikita Vlasenko
  • 2,558
  • 3
  • 26
  • 38
4
votes
1 answer

Get gene lines from a GTF file

I would like to retrieve gene lines from a GTF file for which I only have exon and transcript lines (output from Cufflinks), with alternative splicing possible. I need gene lines for compatibility with a pipeline dealing with GTF in Ensembl format.…
aechchiki
  • 2,676
  • 11
  • 34
4
votes
2 answers

What kind of analysis can be done with differential expression of transcription factors?

I have two different stem cell types and their respective gene expression from RNA-seq. I noticed that there is differential expression in some of the genes coding for histones as well as transcription factors. What kind of interesting downstream…
jaslibra
  • 524
  • 2
  • 9