5

I have a lot of FASTA files that I recently scaffolded with RagTag. The output of RagTag includes a directory for each sample and the corresponding FASTA file written as "ragtag.scaffold.fasta"

I would like to go through each file and rename it like so:

Original path with file name:
directoryName_ID/ragtag.scaffold.fasta

Changed file name:
directoryName_ID/ID_ragtag.scaffold.fasta

As you can see, the directory name includes the ID number I would like to add to the FASTA file name (followed by an underscore as a separator). Is there a way to loop through all of the directories, take the number after the "_" in each directory name, and use it to rename my FASTA file?

Here is what I have tried so far:

for filename in ./directory*/ragtag.scaffold.fasta; do 
    mv filename ${./directory%*_*}_ragtag.scaffold.fasta; 
done

It doesn't work at all but I know I need to use mv to rename the filename. However, when it comes to referencing the directory name the file is located in I get lost. I also have a .txt file that contains a list of IDs if that might be easier to reference. I'm still kinda new to the command line so if anyone could help that would be greatly appreciated!

terdon
rimo

4 Answers

3

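Run the following from the parent directory that contains the directories of interest (post edited to include the OP's "constant term"):

for d in *; do
  # Source and destination must be on the same command line, so the line is continued with a backslash
  mv "$d"/ragtag.scaffold.fasta \
     "$d/${d/000mergedCluster_/}_ragtag.scaffold.fasta"
done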

The key assumption is that each directory name consists of a "constant term" followed by a "variable term".

Note that $d here is the directory name, not the file name, because the file name is the same in every directory.


Comments...

$d/${d/000mergedCluster_/}

The key part of the code is above.

It takes the variable $d and deletes the term 000mergedCluster_ (or whatever the "constant term" is), leaving only the "variable term".
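As a minimal sketch of the same idea, the ID can also be extracted with prefix-removal expansions, which should work in any POSIX shell rather than only in Bash. The directory name below is the hypothetical example given in the comments, and the variable names are made up for illustration.

```shell
# Hypothetical directory name, taken from the example in the comments
d="000mergedCluster_1234"

# Remove a known constant prefix from the start of the value
id_by_prefix="${d#000mergedCluster_}"

# Remove everything up to and including the last underscore,
# which works without knowing the constant term in advance
id_by_suffix="${d##*_}"

# Build the new path the same way the loop in this answer does
new_name="$d/${id_by_suffix}_ragtag.scaffold.fasta"

echo "$id_by_prefix"
echo "$id_by_suffix"
echo "$new_name"
```

Both expansions leave only the "variable term", so either could replace the substitution in the loop if the shell in use is not Bash.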

terdon
M__
  • 2
    Thank you this works great! Only thing is that it includes the "constant term" in the new renamed file...is there a way to change it so only the "variable term" (whats after the _ in the original file name) is added to the file name? – rimo Jan 17 '23 at 21:55
  • @rimo What is your "constant term" in the directory name? – M__ Jan 17 '23 at 21:59
  • 1
    The whole name is something like "000mergedCluster_1234" so the constant term is "000mergedCluster_" and the variable term is "1234" – rimo Jan 17 '23 at 22:00
  • @rimo Can you try it now? If you copy and pasted the edited code that should work – M__ Jan 17 '23 at 22:04
  • 1
    She's perfect! Sorry that was silly of me but I completely understand what this command is doing now. Thank you so much for your help!! This works great! – rimo Jan 17 '23 at 22:06
  • 2
    The asterisk wildcard by itself will match all files and directories (except dot files/folders). There's no guarantee that $d is actually a directory. Better to use a more specific pattern, for example: for d in 000mergedCluster_*. – Steve Jan 18 '23 at 00:18
  • @Steve agreed. That should have been updated, but at the time the directory name was not known, however for d in *_* would have been better. – M__ Jan 18 '23 at 01:25
3

Assuming the output is only a list of directories and each directory includes a file called "ragtag.scaffold.fasta", you could use brace expansion to generate the SOURCE and DEST strings:

for i in directoryName_* ; do
    echo mv "${i}"/{,"${i#directoryName_}"_}ragtag.scaffold.fasta
done

The above relies on brace expansion, which Bash, Zsh, and some other shells support but plain POSIX sh does not. For a portable version, use:

fasta="ragtag.scaffold.fasta"
for i in directoryName_* ; do
    echo mv "${i}/${fasta}" "${i}/${i#directoryName_}_${fasta}"
done

Remove the echo to perform the rename.
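A more defensive variant might skip anything that is not a directory or that lacks the FASTA file. The sketch below builds a throwaway directory tree with mktemp so it can be run safely; the directory names, the IDs, and the stray file are all made up for illustration, and the real loop would be run in the RagTag output directory instead.

```shell
# Build a throwaway directory tree so the sketch can be run safely.
workdir=$(mktemp -d)
mkdir "$workdir/directoryName_1234" "$workdir/directoryName_5678"
touch "$workdir/directoryName_1234/ragtag.scaffold.fasta"
touch "$workdir/directoryName_5678/ragtag.scaffold.fasta"
touch "$workdir/directoryName_notes.txt"   # a stray file the glob also matches

fasta="ragtag.scaffold.fasta"
(
    cd "$workdir" || exit 1
    for i in directoryName_*; do
        [ -d "$i" ] || continue          # skip anything that is not a directory
        [ -f "$i/$fasta" ] || continue   # skip directories without the FASTA file
        id=${i##*_}                      # text after the last underscore
        mv -- "$i/$fasta" "$i/${id}_$fasta"
    done
)

# List the renamed files
ls "$workdir"/directoryName_*/
```

Using `${i##*_}` instead of a fixed prefix means the loop does not need to know the constant part of the directory name, only that the ID follows the last underscore.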

Steve
1

If you want to use Perl:

#!/usr/bin/perl
use 5.030;
use warnings;

use File::Find::Rule qw( );

for my $file (File::Find::Rule->name("ragtag.scaffold.fasta")->in(".")) {
    my $newfile = $file;

    # Prefix the file name with the digits after the last "_" in the directory name.
    # Skip files whose directory name does not end in "_<digits>".
    next unless $newfile =~ s{^(.*_(\d+)/)([^/]+\.fasta)$}{$1${2}_$3};

    if (-e $newfile) {
        warn "can't rename $file to $newfile: $newfile exists\n";
    }
    elsif (rename $file, $newfile) {
        # do nothing
    }
    else {
        warn "rename $file to $newfile failed: $!\n";
    }
}

Supertech
1

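Perl one-liner (it only prints the mv commands; replace filename with ragtag.scaffold.fasta, check the output, then pipe it to sh to run them):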

ls */filename | perl -lane 'print "mv $1/$3 $1/$2_$3" if /(\S+_([^\/_\s]+))\/(\S+)/'
user172818