Question:
Extracting data from pdb file in Perl?
2011-12-22 14:57:39 UTC
The task is to create a program that reads and collects data from a pdb file, then display data when the user requests. The program should have 1 command line argument which is the pdb file, it should read the file and display the number of atoms in the file. Then it should prompt for a command from the user, the user should be able to enter 'freq' 'length'and 'quit'. If the user enters freq, the program should output all the types of atoms in the pdb file and their frequency, such as C: 3201 P: 42 N: 918 S: 23 O: 1101 If the user enters length the program should display the greatest distance between any two atoms in the file. I know that the data I'll need to extract from the pdb file include the atom abbreviation (Na, Ca, etc.) and the atom coordinates to calculate the distances. Atom abbreviations are in columns 77-78 of the file, where a column is one character. X coordinates are in columns 31-38, y coordinates are in 39-46 and z coordinates are in columns 47-54. Any at all help or advice on how to go about writing this program would be much appreciated, thanks.
Three answers:
Takanori Kawai
2011-12-22 16:18:27 UTC
How many lines does your pdb file have?

I'm not sure what pdf is. It might be "Recod: ATOM" of this page.

http://deposit.rcsb.org/adit/docs/pdb_atom_format.html



How about this? But it takes long time for "length" if your pdb file has a lot of lines.



use strict;

use warnings;

die "Specify PDB!" if(@ARGV < 1);

my $pdb = $ARGV[0];

die "Not Readable $pdb" unless(-r $pdb);



sub getCmd{ print 'Command: '; my $cmd = ; chomp($cmd); lc($cmd);}

while ( my $cmd = getCmd())

{

exit if($cmd eq 'quit');

if($cmd eq 'freq')

{

my %atom;

open my $fh, '<', $pdb;

while(my $rec = <$fh>)

{

my $wAtom = substr($rec, 76, 2);

$wAtom =~ s/^ //;

$atom{$wAtom} ||= 0;

++$atom{$wAtom}

}

close $fh;

foreach my $wAtom (keys %atom)

{

print "$wAtom\t$atom{$wAtom}\n";

}

}

elsif($cmd eq 'length')

{

my @pos;

my %max;

my $line = 0;

$max{length} = 0;

open my $fh, '<', $pdb;

while(my $rec = <$fh>)

{

++$line;

my ($x, $y, $z) = (substr($rec, 30, 8),

substr($rec, 38, 8),

substr($rec, 46, 8));

foreach my $raPos (@pos)

{

my($oLine, $oX, $oY, $oZ) = @$raPos;

my $wLength = sqrt(($oX - $x)**2 + ($oY - $y)**2 + ($oZ - $z)**2);

if($max{length} < $wLength)

{

$max{length} = $wLength;

$max{pos} = "($oLine, $line)";

}

}

push @pos, [$line, $x, $y, $z];

}

print "LENGTH: $max{length} at $max{pos}\n";

}

}



EDIT:

However if I were you, I would make a script taking some options like below.



use strict;

use warnings;

use Getopt::Long;

my %opt;

GetOptions('length!' => \$opt{length}, 'freq!' => \$opt{freq});

map { $opt{$_} = 1 unless(defined($opt{$_})); } keys(%opt);



my %atom;

my @pos;

my %max;

my $line = 0;

$max{length} = 0;

while(my $rec = <>)

{

if($opt{freq})

{

my $wAtom = substr($rec, 76, 2);

$wAtom =~ s/^ //;

$atom{$wAtom} ||= 0;

++$atom{$wAtom}

}

if($opt{length})

{

++$line;

my ($x, $y, $z) = (substr($rec, 30, 8),

substr($rec, 38, 8),

substr($rec, 46, 8));

foreach my $raPos (@pos)

{

my($oLine, $oX, $oY, $oZ) = @$raPos;

my $wLength = sqrt(($oX - $x)**2 + ($oY - $y)**2 + ($oZ - $z)**2);

if($max{length} < $wLength)

{

$max{length} = $wLength;

$max{pos} = "($oLine, $line)";

}

}

push @pos, [$line, $x, $y, $z];

}

}

if($opt{freq})

{

foreach my $wAtom (keys %atom)

{

print "$wAtom\t$atom{$wAtom}\n";

}

}

if($opt{length})

{

print "LENGTH: $max{length} at $max{pos}\n";

}
Grace
2016-02-28 09:04:25 UTC
Hint #1: if you want to "extract", use the m// operator rather than the s/// operator. Hint #2: if you want programming help, post the code you tried and we'll try to fix it for you. Hint #3: do NOT bother posting Perl code unless it has "use strict; use warnings;" at the top.
criton
2016-12-16 10:04:51 UTC
Adit Pdb


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.
Loading...