Most Popular

1500 questions
3
votes
2 answers

What are RIKEN genes?

I am using data from the Tabula Muris Consortium. These are gene counts from scRNA-seq of mouse cells. Some genes have names ending with the suffix 'Rik' (e.g. 0610005C13Rik or 0610007C21Rik). Searching online I found that they are…
gc5
3
votes
1 answer

Can the canu assembler output a fastq file of the final assembly just like HGAP4?

I have assembled a genome from Sequel PacBio data, both with HGAP4 on the SMRT Link interface and with canu on the command line. The HGAP4 assembler outputs a fastq file of the final assembly such that I can see how reliable the sequence is along…
Biomagician
3
votes
1 answer

How do I validate a single sample ArrayCGH result?

We have arrayCGH (aCGH) results for one sample. There is a 0.5 Mb terminal duplication on chromosome 19 (62995490-63407936, according to NCBI36/hg18). The duplication is rare: a literature review suggests there are only 3-4 samples with clinical…
zx8754
3
votes
1 answer

R: 'Matrix' can not be unloaded, but 'writeMM' method not found

I want to save a very large sparse dgCMatrix to disk. I read that this can be done with the writeMM method. But when I try: writeMM(UMI_count, "/home/gene_count_filtered/filtered_dataset") RStudio gives me an error: Error in…
Nikita Vlasenko
3
votes
1 answer

Large dataset normalization for PCA

I need to normalize a large (400Mb) dataset for PCA. I want to use scran for that: biocLite("SingleCellExperiment") biocLite("scran") library(SingleCellExperiment) library(scran) list_of_sce <- list() # Looping through the…
Nikita Vlasenko
3
votes
5 answers

Import a tab-separated file with differing numbers of elements in each row; prokka output

I am using prokka to annotate a bacterial genome: prokka ecoli.fa Prokka is outputting a tab-separated file (called PROKKA_12142017.tsv) with differing numbers of elements in each row: locus_tag ftype gene EC_number …
Biomagician
3
votes
1 answer

sorting BAM file error using samtools

I have a few BAM files and would like to get read counts using samtools idxstats [data is aligned to the hg19 transcriptome]. To use that command I need a sorted BAM file, so to sort them I ran the following command: samtools sort -T /tmp/input.sorted…
stack_learner
3
votes
2 answers

Protein fold pathfinding?

This is probably a long shot, since it would depend so heavily on the underlying folding algorithms. Are there known algorithms for mapping the path followed in a protein folding simulation when you have the starting conformation and the solution…
CoryG
3
votes
1 answer

How to use the hmmsearch on prototypic repetitive sequences Repbase Update database?

I want to use the hmmsearch approach proposed in Convergence of retrotransposons in oomycetes and plants by Kirill Ustyantsev, Alexandr Blinov and Georgy Smyshlyaev. With help from terdon, I managed to create the correct profile. I assumed that the…
A.Dumas
3
votes
2 answers

scater: SingleCellExperiment, Error in seq_len(ncol(assay))

I am trying to go through the following scRNA-seq tutorial. But the line sce <- newSCESet(countData=all.counts) is not working anymore with the most up-to-date version of scater. Now we should use SingleCellExperiment function instead, so when…
Nikita Vlasenko
3
votes
1 answer

What are the differences between GWAS and QTL mapping?

What are the differences between the two methods? What advantages does one have over the other, and what are their limitations relative to one another?
jhurst5
3
votes
1 answer

Folded Protein Chunk Dimensional Classification?

Are there known dimensional measurements for the classification of folded proteins given a starting chunk/domain as defined by something like the clustering functionality of MSM Builder? Examples of what I would be looking for would be dimensions…
CoryG
3
votes
0 answers

Trying to show the gaps of each seq on Bio::Graphics after converting clustalw

I want a box representing each sequence, positioned as it is in the alignment, with gaps shown as breaks in each box. I've been struggling with this for a while and have tried to solve it on my own, but I can't. I'm trying to…
3
votes
2 answers

NCBI - edirect suite to download all genome sequences associated with a query - troubleshooting

Not sure if this is allowed but I cannot think of a better place to ask. I am attempting to download all genomic sequences from refseq associated with the query Peptostreptococcaceae. I have been using the following command (based on some…
AudileF
3
votes
1 answer

How can I apply proportional (p) distances (nucleotide) using Biopython?

I am working on phylogenetic tree generation. I used Biopython and ClustalW 2.x.x for this purpose. I have generated the tree using Biopython, but when I try to generate the tree using the "MegaSoftware GUI" tool, my tree does not match the output of…