Complex regex matches in Python


I have a text file that contains the following data:

chri

atgccttgggcaacggt...(multiple lines)

chrii

aggttggccaaggtt...(multiple lines)

I want to first find 'chri', then iterate through the multiple lines of ATGC until I find the xth character. Then I want to print everything from the xth character to the yth character. I have been using regex, but once I have located the line containing chri, I don't know how to continue iterating to find the xth character.

Here is my code:

for i, line in enumerate(sacc_gff):
    for match in re.finditer(chromo_val, line):
        print(line)
        for match in re.finditer(r"[atgc]{%d},{%d}\z" % (int(amino_start), int(amino_end)), line):
            print(match.group())

What the variables mean:

chromo_val = chri

amino_start = (some start point my program found)

amino_end = (some end point my program found)

Note: amino_start and amino_end need to be in variable form.

Please let me know if I can clarify anything for you. Thank you.

It looks like you are working with FASTA data, so I will provide an answer with that in mind. If it isn't FASTA, you can still use the sub-sequence selection part.

fasta_data = {}  # creates an empty dictionary
with open(fasta_file, 'r') as fh:
    for line in fh:
        if line[0] == '>':
            # strip the newline character and remove the leading '>' character
            seq_id = line.rstrip()[1:]
            fasta_data[seq_id] = ''
        else:
            # append this line of sequence to the current chromosome
            fasta_data[seq_id] += line.rstrip()

# return the substring from chromosome 'chri' with the first character at amino_start, not including amino_end
sequence_string1 = fasta_data['chri'][amino_start:amino_end]
# return the substring from chromosome 'chrii' with the first character at amino_start, including amino_end
sequence_string2 = fasta_data['chrii'][amino_start:amino_end+1]
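One thing to watch with the slices above is that Python counts from 0 and leaves out the end index. If your amino_start and amino_end count from 1 and should include both ends (that is an assumption on my part, the question doesn't say), a small helper along these lines may save you some off-by-one errors. The name `subsequence` is made up for this sketch, and the `int()` calls are only there because the question converts both values with `int()`.

```python
def subsequence(sequence, start, end):
    """Return characters start..end, counting from 1 and including both ends.

    Assumes 1-based, inclusive positions (hypothetical convention for this sketch).
    """
    start, end = int(start), int(end)
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Shift the start back by one for 0-based indexing; the slice end is
    # exclusive, so using `end` directly keeps the end position included.
    return sequence[start - 1:end]


# 3rd through 6th characters of the first data line from the question
print(subsequence("atgccttgggcaacggt", 3, 6))  # gcct
```

If your positions are already 0-based, plain slicing as in the answer is all you need.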

FASTA format:

>chr1
atttatatatat
atggcgcgatcg
>chr2
aatcgctgctgc
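To see the whole thing end to end, here is a minimal sketch that runs the same idea on the sample above, reading from an in-memory string instead of a file so you can try it as is. The function name `parse_fasta` and the start and end values are my own picks for illustration. It collects the lines in a list and joins them once at the end, which tends to be kinder to memory for long chromosomes than repeated `+=`.

```python
import io


def parse_fasta(handle):
    """Map each sequence ID to its full sequence, joined across lines."""
    sequences = {}
    seq_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            seq_id = line[1:]  # drop the leading '>' character
            sequences[seq_id] = []
        elif seq_id is not None:
            sequences[seq_id].append(line)
    # Join the collected lines once per sequence
    return {name: ''.join(parts) for name, parts in sequences.items()}


# The sample FASTA data, held in memory instead of a file
sample = io.StringIO(">chr1\natttatatatat\natggcgcgatcg\n>chr2\naatcgctgctgc\n")
fasta_data = parse_fasta(sample)

amino_start, amino_end = 2, 6  # illustrative values, 0-based, end not included
print(fasta_data['chr1'][amino_start:amino_end])  # ttat
```

For a real file, pass the handle from `open(fasta_file, 'r')` in place of the `io.StringIO` object.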
