Parsing an EDI file using PHP / MySQL

I have a 40K-line EDI (fixed-length) file that I must parse into a CSV. If you know EDI, you know that each element has its own fixed length.

I'm new to parsing EDI and just need a bit of help. My first thought is to set up a table that holds the element lengths, either as an array or like this:

Table EDIInfo
EDI_ID           |  EDI_ElemLengths

1                |  3,22,7s2,30,30,22
2                |  30,5s2,9s2,3,1,23

** The s in the lists above denotes a decimal point after the second place from the right.

So once I get this data into the DB, I'm not sure how to pull it out and apply it to the file on my server. The file has no extension at this point; it's a simple text file. I'd like to parse it into a new file, XXXX.csv, in the same directory.

Any links to tutorials, or any help or direction, would be greatly appreciated.

If you don't know EDI, it's basically a text file with a "record" on each line composed of "elements". Each "element" is allotted a fixed number of characters on that line, even if its value does not fill the allotted space. An element is similar to a field: just as a field is defined as, say, varchar 64, an element is defined by the number of characters it is allowed to use in the text file. Elements sit directly against one another; there are no delimiters other than the space allotment itself.
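As a rough illustration of the approach described above, here is a minimal sketch that slices one line by a width list and writes CSV rows. It is written in Python for brevity. The function names (`parse_spec`, `parse_line`, `convert`) are made up for this example, and it assumes that a token such as `7s2` means "7 characters wide, digits only, with 2 implied decimal places", which is one reading of the note about the `s` marker.

```python
import csv
import io
from decimal import Decimal


def parse_spec(spec):
    """Turn '3,22,7s2' into [(3, 0), (22, 0), (7, 2)] as (width, decimal_places)."""
    fields = []
    for token in spec.split(","):
        width, _, places = token.strip().partition("s")
        fields.append((int(width), int(places) if places else 0))
    return fields


def parse_line(line, fields):
    """Slice one fixed-width record into a list of trimmed values."""
    values, pos = [], 0
    for width, places in fields:
        raw = line[pos:pos + width].strip()
        pos += width
        if places and raw:
            # Assumption: the digits carry an implied decimal point.
            raw = str(Decimal(raw).scaleb(-places))
        values.append(raw)
    return values


def convert(src, dst, spec):
    """Read fixed-width lines from src and write CSV rows to dst."""
    fields = parse_spec(spec)
    writer = csv.writer(dst)
    for line in src:
        line = line.rstrip("\r\n")
        if line:
            writer.writerow(parse_line(line, fields))


# Small in-memory demo using the first width list from the table above.
sample = ("001" + "ACME".ljust(22) + "0012345"
          + "A".ljust(30) + "B".ljust(30) + "C".ljust(22) + "\n")
out = io.StringIO()
convert(io.StringIO(sample), out, "3,22,7s2,30,30,22")
```

The same idea carries over to PHP with `fgets`, `substr`, `trim` and `fputcsv`. A non-numeric value in a decimal field makes `Decimal` raise an error here, so real data may need a gentler fallback.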

Thank you

EDI elements are not "fixed length" by the traditional definition. Not sure where you got that idea. Your statement "If you know edi you know that each element has its own fixed length" is false. Your statement "its basically a text file with a 'record' on each line composed of 'elements'" is also incorrect. If your segment terminator is a CR or LF, your text editor will render the file as one segment per line. What if your segment terminator were a tilde (~)? Then your file would be a single text stream.

Per the EDI dictionary, an element has a minimum and a maximum length. If an element has min 4 / max 8, it is variable length: it is not padded out to the full 8 characters. EDI is a structured, delimited format. The only fixed-length segment is the ISA (in ANSI X12).

If you're working with ANSI X12, there are three delimiters: segment, element and subelement. You can find them by parsing the ISA segment. Once you have the delimiters, you can parse the rest of the file. If you're parsing by delimiters, the only time you have to worry about element length is if you're syntax checking against the standards dictionary - something you probably aren't interested in doing.
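To make the delimiter lookup concrete, here is a minimal sketch in Python. It relies on the ISA segment being fixed length (106 characters including its terminator), so the element separator is the 4th character, the subelement separator the 105th, and the segment terminator the 106th. The function names are invented for this example, and it does no validation beyond checking for the ISA header.

```python
def read_x12_delimiters(data):
    """Read the three X12 delimiters from the fixed-length ISA segment."""
    if len(data) < 106 or not data.startswith("ISA"):
        raise ValueError("missing 106-character ISA header")
    return {
        "element": data[3],       # character right after "ISA"
        "subelement": data[104],  # ISA16, the last ISA element
        "segment": data[105],     # character that ends the ISA segment
    }


def split_segments(data, delims):
    """Split an interchange into segments, each a list of elements."""
    segments = []
    for raw in data.split(delims["segment"]):
        # Drop line breaks that some senders add after the terminator.
        raw = raw.strip("\r\n")
        if raw:
            segments.append(raw.split(delims["element"]))
    return segments
```

Each resulting segment is a list whose first item is the segment ID (`ISA`, `GS`, `ST`, and so on), which can then be written out as a CSV row. In PHP the same steps map onto `substr` and `explode`.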

If you're working with EDIFACT, the same general idea applies: you get the delimiters from the enveloping, although there can be six of them. I am only assuming you are working with ANSI X12.
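For EDIFACT, a comparable sketch reads the optional UNA service string, whose six characters follow the letters `UNA` in a fixed order. The function name is made up for this example, and the fallback string is the usual set of EDIFACT default service characters, used here on the assumption that a file without a UNA segment follows those defaults.

```python
# Default service characters, assumed when no UNA segment is present.
DEFAULT_UNA = ":+.? '"


def read_edifact_delimiters(data):
    """Return the six EDIFACT service characters from UNA, or the defaults."""
    if data.startswith("UNA") and len(data) >= 9:
        chars = data[3:9]
    else:
        chars = DEFAULT_UNA
    return {
        "component": chars[0],  # component data element separator
        "element": chars[1],    # data element separator
        "decimal": chars[2],    # decimal mark
        "release": chars[3],    # release (escape) character
        "reserved": chars[4],   # reserved, normally a space
        "segment": chars[5],    # segment terminator
    }
```

A full EDIFACT splitter would also have to honor the release character, so that an escaped delimiter inside a value is not treated as a separator. That step is left out here.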

There are tons of parsers out there. You are reinventing the wheel. Existing parsers probably even have FA (functional acknowledgment) generation and communication tools built in. If you're looking at a lot of raw EDI data and need context as to what the data means, look at this free EDI Notepad tool: http://liaison.com/products/integrate/edi-notepad