Question:
String manipulation in Unix Shell Script?
Jason
2006-07-27 10:22:39 UTC
Hello, I have a list of numbers stored in a file

123
12
12345
1234
1
123

I need them all to be the same number of digits (say 7 digits) with preceding zeros.

0000123
0000012
0012345
0001234
0000001
0000123

I have written a script that uses while loops to prepend zeros until each number has the proper number of digits, and I have also tried a hard-coded case test that picks the proper string to concatenate to the front of each line.
The problem is, I have a table of 3 columns and 40,000 lines long... it takes too long with all the approaches I've tried.
Is there an easier approach?
I've been looking into sed lately without much luck.

Any help would be wonderful!
Five answers:
BalRog
2006-07-27 11:24:33 UTC
At the risk of starting a religious war, I would suggest trying some variant of awk (awk, nawk, gawk, or whatever is on your system).



For a CSV file you could try the following command (assumes the 2nd field ($2) is the one you need to change):



$ awk -F, -v OFS=, \
'{ $2=( substr( "0000000", 1, 7-length( $2 ) ) $2 ); print; }' \
varLength.csv >length7.csv



The "-F," option sets the input field separator to be ",". The "-v OFS=," option does the same for the output field separator. The long single-quoted string is the "awk program" (Sorry, I don't have time to explain the program. Go find a book on awk.).
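For readers who want the program spelled out, here is a minimal sketch of the same idea using awk's sprintf instead of substr. The function name pad_field and the sample rows are made up for illustration, and it assumes the chosen field holds plain non-negative integers.

```shell
# Hypothetical helper: pad_field N reads CSV on stdin and zero-pads field N to 7 digits.
pad_field() {
  # -F, and OFS=, keep commas as the separator on input and output.
  # sprintf("%07d", ...) formats the field as an integer, left-padded with zeros to width 7.
  awk -F, -v OFS=, -v col="$1" '{ $col = sprintf("%07d", $col); print }'
}

# Example with made-up rows: pad the 2nd field.
printf '%s\n' 'a,123,x' 'b,12345,y' | pad_field 2
```

Unlike the substr version, a field that already has more than 7 digits is printed unchanged rather than padded, and a non-numeric field is converted to 0, so this only suits columns that really are numbers.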



The religious war part is that Perl proponents will claim that anything Awk can do Perl can do better and faster. I don't know Perl so I have no basis to judge the claim. This Awk program should at least be much faster than your shell script loop.



If you *really* need speed you can look into "lex", but that requires at least some rudimentary understanding of C programming.
califf
2016-11-05 01:43:50 UTC
Unix String Manipulation
Charles G
2006-07-27 10:38:06 UTC
If perl is installed you can do something like this



perl -e 'print sprintf("%0.7d","1234");'



Here is a perl script that does most of the work.



#!/usr/bin/perl

my $file = $ARGV[0];
my $tmpfile = $file.$$;   # temporary name: the original name plus the process ID
open(IN, '<', $file) or die "Cannot open $file: $!";
open(OUT, '>', $tmpfile) or die "Cannot open $tmpfile: $!";

while (my $line = <IN>) {
    chomp($line); # removes linefeed

    # Assuming tab-separated columns
    my ($col1, $col2, $col3) = split(/\t/, $line);

    # Assuming the first column ($col1) is the column for the numbers.
    print OUT sprintf("%0.7d\t%s\t%s", $col1, $col2, $col3), "\n";
}

close(OUT);
close(IN);

# You can then copy the $tmpfile to the original filename $file
Arthur
2006-07-27 10:38:13 UTC
Paste the strings into Excel and format the cells with a custom number format of 00#####00000



You can get the results promptly!
Blues Man
2006-07-27 10:36:04 UTC
Examine the tools grep and sed. I can't go into details but those guys can help you.
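As one possible way to do this with sed alone, the sketch below prepends seven zeros to each line and then keeps only the last seven characters. The function name pad7 is made up for illustration, and it assumes a single number per line with at most 7 digits (longer numbers would be truncated on the left).

```shell
# Hypothetical helper: reads one number per line on stdin, writes 7-character strings.
pad7() {
  # First expression: put seven zeros in front of every line.
  # Second expression: replace the whole line with its last seven characters.
  sed -e 's/^/0000000/' -e 's/^.*\(.\{7\}\)$/\1/'
}

# Example with numbers from the question.
printf '%s\n' 123 12 12345 | pad7
```

For a multi-column table this trick gets awkward, because the pattern has to be anchored to one column, which is where awk's field handling is easier.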


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.