Wednesday, October 26, 2005

 

sed dos2unix

Sed dos2unix

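The quickest approach uses sed to delete the last character of every line:

[user@host] # sed 's/.$//' unixfile > unixfile.sed

Unfortunately, this also removes characters that you may not want removed: on any line that does not end in a carriage return, the last real character is deleted instead (e.g. the "T" in "CDT").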
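A quick way to see the problem is to feed sed one line that ends in CR/LF and one that ends in a bare LF. This is only an illustrative sketch: strip_last_char is a made-up helper name and the sample text is invented.

```shell
# Hypothetical helper: the same "delete the last character" expression,
# reading from standard input.
strip_last_char() {
    sed 's/.$//'
}

# The first line ends in CR/LF, the second in a bare LF.
printf 'time zone: CDT\r\ntime zone: CDT\n' | strip_last_char
# The first line loses only its CR; the second line loses the "T".
```

So the trick is only safe when you are sure that every line in the file really ends in CR/LF.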

Another option uses sed again, but strips the specific character instead of the last character on each line:

[user@host] # sed 's/^M//g' unixfile > unixfile.sed2

One very important point about this command: the "^M" control character is not produced by typing the "^" character followed by the "M" character. Instead, type Ctrl-V and then Ctrl-M (hold the Ctrl key while pressing V, then again while pressing M). This inserts a literal carriage return, displayed as "^M", which sed can then locate and remove as instructed.
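If typing the control character is awkward (inside a script file, say), one alternative is to let printf produce the carriage return and splice it into the sed expression. This is a sketch rather than part of the recipe above. CR and strip_cr are names picked here for illustration, and unlike the command above it only removes a carriage return that sits at the very end of a line.

```shell
# Hold a literal carriage return in a variable (illustrative name).
CR=$(printf '\r')

# Delete a carriage return only when it is the last character on the line.
# Inside double quotes, \$ reaches sed as a plain $ (end-of-line anchor).
strip_cr() {
    sed "s/${CR}\$//"
}

# Usage: strip_cr < dosfile > unixfile
```

Because the expression is anchored to the end of the line, lines that are already in Unix format pass through untouched.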

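The most desirable option is to run the dos2unix utility against the file:

[user@host] # dos2unix unixfile unixfile.dos2unix

(That two-argument form is the Solaris syntax. The dos2unix shipped with most Linux distributions converts every file named on the command line in place, so there you would write dos2unix -n unixfile unixfile.dos2unix instead.)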

####################################################################################################

 

Convert dos text files to unix, and vice versa:

dos2unix file.txt
unix2dos file.txt
tr -d '\015' < win.txt > unix.txt  # if you can't find dos2unix
sed -e 's/$/\r/' < unix.txt > win.txt  # if you can't find unix2dos (needs GNU sed for \r)
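The sed fallback above depends on sed understanding \r in the replacement, which GNU sed does but other seds may not, and it adds a second CR to any line that already ends in one. A sketch that avoids both problems, with CR and to_dos as illustrative names:

```shell
# Hold a literal carriage return in a variable (illustrative name).
CR=$(printf '\r')

# to_dos (hypothetical helper): make every line end in exactly one CR.
# "CR*$" matches any run of CRs at the end of the line, including none,
# and replaces it with a single CR.
to_dos() {
    sed "s/${CR}*\$/${CR}/"
}

# Usage: to_dos < unix.txt > win.txt
```

Running it twice on the same file gives the same result as running it once.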

 

####################################################################################################

 

With vi

 

Note that some programs are inconsistent in the way they insert line breaks, so you end up with some lines that have both a ^M and a line break, and some lines that have a ^M and no line break (and so blend into one). There are two steps to clean this up.

1. Remove all extraneous ^M:

:%s/^M$//g

BE SURE YOU MAKE the ^M USING "CTRL-V CTRL-M", NOT BY TYPING "CARET M"! This expression replaces every ^M that has a line break after it with nothing. (The dollar sign ties the search to the end of a line.)

2. Replace all ^M's that need to become line breaks:

:%s/^M/\r/g

Once again: BE SURE YOU MAKE the ^M USING "CTRL-V CTRL-M", NOT BY TYPING "CARET M"! This expression replaces every ^M that did not have a line break after it with a line break. (In the replacement part, Vim reads \r as "start a new line".)

 

It also works with
:%s/\r//g
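Outside the editor, the two-step cleanup for files with mixed line breaks can be approximated with a small pipeline. This is a sketch under the same assumptions as the vi steps; CR and fix_mixed are names invented here.

```shell
# Hold a literal carriage return in a variable (illustrative name).
CR=$(printf '\r')

# fix_mixed (hypothetical helper):
#   step 1: drop a CR that sits directly before the line break
#   step 2: turn every remaining (lone) CR into a line break
fix_mixed() {
    sed "s/${CR}\$//" | tr '\r' '\n'
}

# Usage: fix_mixed < mixed.txt > unix.txt
```

The order matters: if the lone CRs were converted first, every CR/LF pair would turn into two line breaks.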

 

 

I think using this command is easier (write the file with :w afterwards).
    :set ff=unix    " convert to a Unix file
    :set ff=dos     " convert to a Windows file

 

Or

:set fileformat=dos
:set fileformat=unix

 

 

with:
:%s/^M/\r/g
works perfectly !!!

 

 

####################################################################################################

 

Quick Script

 

#!/bin/bash
# Replace DOS line breaks for Unix compatibility

echo "This script will remove the ^M line breaks from DOS."

echo -n "Enter filename without extension: "
read -r file
echo -n "Enter extension: "
read -r ext
# Write to "${file}2.$ext": a bare $file2 would be read as an unset variable
sed 's/\r$//' "$file.$ext" > "${file}2.$ext"
cp -f "${file}2.$ext" "$file.$ext"
rm -f "${file}2.$ext"

 

This script is the same as before, minus one step (it asks for the whole filename at once).
#!/bin/bash
# Replace DOS line breaks for Unix compatibility
echo "This script will remove the ^M line breaks from DOS."
echo -n "Enter filename: "
read -r file
# Put the suffix at the end so that names such as dir/name still work
sed 's/\r$//' "$file" > "$file.2"
cp -f "$file.2" "$file"
rm -f "$file.2"

 

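Here's another little script
#!/bin/sh
FILE="$1"
# printf supplies a real carriage return; not every sed understands \r
CR=$(printf '\r')
# -i edits the file in place. The empty '' (no backup suffix) is the
# BSD/macOS form; GNU sed wants -i on its own.
sed -i '' "s/$CR\$//" "$FILE"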
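The scripts above either prompt for the name or lean on sed -i, whose syntax differs between GNU and BSD sed. A sketch that takes the file names as arguments and goes through a temporary file instead is shown below. It assumes mktemp is available, and CR and strip_cr_inplace are made-up names.

```shell
#!/bin/sh
# Hold a literal carriage return in a variable (illustrative name).
CR=$(printf '\r')

# strip_cr_inplace (hypothetical helper): remove the trailing CR from
# every line of the file named in $1, via a temporary file.
strip_cr_inplace() {
    tmp=$(mktemp) || return 1
    if sed "s/${CR}\$//" "$1" > "$tmp"; then
        # Copy the contents back rather than moving the temp file,
        # so the original keeps its permissions.
        cat "$tmp" > "$1"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Convert every file named on the command line.
for f in "$@"; do
    strip_cr_inplace "$f"
done
```

Saved as, say, strip-cr.sh, it would be run as sh strip-cr.sh notes.txt other.txt. The original file is only overwritten when sed succeeded.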

 

####################################################################################################

 

Just trim

 

From the UNIX shell: tr -d "\015" < inputfile > outputfile

E.g.: tr -d "\015" < dosformatted.txt > unixformatted.txt

 

####################################################################################################

 

Let's replace it with a new line!

 

sed "s/^M/\n/g"  replaces the ^M with a Linux newline (GNU sed understands \n in the replacement).
The ^M is typed as Ctrl-V Ctrl-M, not with the caret character.

 

 

####################################################################################################

 

 

To address the problem of ^M (<Ctrl>M) characters in multiple files,
the following single-line shell command will be helpful:

   for name in *.dat ; do sed 's/^M//' "$name" > "${name%.dat}N.dat" && mv "${name%.dat}N.dat" "$name" ; done
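Before rewriting a whole directory it can help to see which files actually contain a carriage return. One way to check is grep -l with a real CR as the pattern. This is a sketch: CR and files_with_cr are illustrative names.

```shell
# Hold a literal carriage return in a variable (illustrative name).
CR=$(printf '\r')

# files_with_cr (hypothetical helper): print the name of each argument
# that contains at least one carriage return.
files_with_cr() {
    [ "$#" -gt 0 ] || return 0      # no arguments: nothing to check
    grep -l "$CR" -- "$@"
}

# Usage: files_with_cr *.dat
```

Files that are already in Unix format are simply not listed, so they can be left alone.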

 

####################################################################################################

 

Now in C

 

/* A program to strip carriage returns (0x0d) from a file */


#include <stdio.h>

int main(int argc, char *argv[])
{
    FILE *in, *tmp;
    int byte;   /* int, so that EOF can be told apart from data */

    printf("Hello! Are you ready to get rid of those nasty crlf?\n");
    if (argc < 2) {
        printf("You need to specify an input file\n");
        return 1;
    }
    /* Open the input first so a bad name leaves no stray tmp.tmp behind */
    if ((in = fopen(argv[1], "rb")) == NULL) {
        printf("We could not open the input file called %s\n", argv[1]);
        return 3;
    }
    if ((tmp = fopen("tmp.tmp", "wb")) == NULL) {
        printf("We could not open the temporary file called tmp.tmp\n");
        fclose(in);
        return 2;
    }
    /* Copy every byte except carriage returns */
    while ((byte = fgetc(in)) != EOF) {
        if (byte != 0x0d) fputc(byte, tmp);
    }
    fclose(tmp);
    fclose(in);
    if (rename("tmp.tmp", argv[1]) != 0) {
        printf("We could not replace %s with tmp.tmp\n", argv[1]);
        return 4;
    }
    return 0;
}

 

####################################################################################################

Sed Again and Again…

# Under UNIX: convert DOS newlines (CR/LF) to Unix format

bash$ sed 's/.$//' file    # assumes that all lines end with CR/LF
bash$ sed 's/^M$//' file   # in bash/tcsh, press Ctrl-V then Ctrl-M

# Under DOS: convert Unix newlines (LF) to DOS format

C:\> sed "s/$//" file    # method 1
C:\> sed -n p file       # method 2

 

And trim one more time…

 

tr -d '\r' < inputfile > outputfile

 

####################################################################################################

 

Now in Perl

One Command Line

The simplest perl script is this one: perl -pi -e 's/\r\n/\n/;' *.java

This does the reverse (run it only on files with Unix line endings, otherwise the CR is doubled): perl -pi -e 's/\n/\r\n/;' *.java

Two Lines

If you wish to be a little more complicated, you can do the same in two lines of perl. This enables you to simply name the file(s) you wish to convert on the command line. It would be used like so: dos2unix-2line *.java

Here is what dos2unix-2line looks like:

#!/usr/bin/perl -pi
s/\r\n/\n/;
  

Here is what unix2dos-2line looks like:

#!/usr/bin/perl -pi
s/\n/\r\n/;
  

 

More perl…

#!/bin/sh
perl -p -i -e 'BEGIN { print "Converting DOS to UNIX.\n" ; } END { print "Done.\n" ; } s/\r\n$/\n/' "$@"
 
 

 

