How to Determine the GC‐Content of a DNA Sequence
Create or accept an input file, read in the file, create a counter, then divide the GC count by the total length of the sequence and output the result in percentage format.
Step-by-Step Guide
Step 1: Create or accept an input file.
This article assumes that the input is in FASTA format, with a single sequence per file.
Step 2: Read in the file.
For FASTA format:
Discard the first line of the file (the header line, which begins with ">").
Remove all remaining newlines and other trailing whitespace.

    from sys import argv

    def init(sequence):
        # Skip the header line, strip each remaining line, and join them.
        with open(argv[1]) as input:
            sequence = "".join([line.strip() for line in input.readlines()[1:]])
        return sequence

Step 3: Create a counter.
Iterate through the data and increment your counter as you encounter any guanine or cytosine nucleotides.

    def GCcontent(sequence):
        GCcount = 0
        for letter in sequence:
            if letter == "G" or letter == "C":
                GCcount += 1
        return GCcount

Step 4: Divide the GC count by the total length of the sequence, and output the result in percentage format.

    def main():
        script, input = argv
        sequence = ""
        sequence = init(sequence)
        # Multiply by 100 so the ratio is printed as a percentage.
        print("%.2f%%" % (100.0 * GCcontent(sequence) / len(sequence)))

    if __name__ == "__main__":
        main()
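The steps above can also be sketched as a compact Python 3 version. This is a minimal illustration under stated assumptions, not the article's own code: the function names read_fasta_sequence and gc_percent are hypothetical, the FASTA record is a made-up in-memory example rather than a real file, and counting lowercase g/c is an added assumption.

```python
import io


def read_fasta_sequence(handle):
    """Return the sequence from a single-record FASTA stream as one string."""
    lines = handle.read().splitlines()
    # Discard the header line, then strip whitespace from each remaining line.
    return "".join(line.strip() for line in lines[1:])


def gc_percent(sequence):
    """Return the GC content as a percentage of the total sequence length."""
    if not sequence:
        raise ValueError("sequence is empty")
    # Assumption: lowercase g/c should count as well, so normalize the case first.
    upper = sequence.upper()
    gc_count = upper.count("G") + upper.count("C")
    return 100.0 * gc_count / len(sequence)


# Hypothetical FASTA record held in memory, standing in for a file on disk.
example = io.StringIO(">example_sequence\nATGCGC\nGGAT\n")
sequence = read_fasta_sequence(example)
print("%.2f%%" % gc_percent(sequence))  # prints 60.00%
```

Here the joined sequence is ATGCGCGGAT, which has 6 G or C bases out of 10, so the GC content is 60%. To read from disk instead, pass an opened file object in place of the io.StringIO example.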
About the Author
Zachary Ramirez
Dedicated to helping readers learn new skills in DIY projects and beyond.