osx - AppleScript mystery: A scientific notation conversion subroutine


I have a bit of a mystery on my hands. I found a subroutine (http://macscripter.net/viewtopic.php?id=27520) to convert a number in scientific notation to a string of digits. However, it seems to delete the remaining digits, no matter what I try.

"1.23456789e+4" should become "12345.6789". Instead, it returns "12345".

Try running the following code and you'll see what I mean. I call a dialog to disclose the result:

    set xx to 1.23456789E+4
    set yy to number_to_string(xx)
    display dialog yy

    on number_to_string(this_number)
        set this_number to this_number as string
        set deci to character 2 of (0.5 as text)
        set x to the offset of deci in this_number
        set z to the offset of "E" in this_number
        if this_number contains "E+" then
            set y to the offset of "+" in this_number
            set the decimal_adjust to characters (y - (length of this_number)) thru ¬
                -1 of this_number as string as number
            if x is not 0 then
                set the first_part to characters 1 thru (x - 1) of this_number as string
            else
                set the first_part to ""
            end if
            set the second_part to characters (x + 1) thru (z - 1) of this_number as string
            set the converted_number to the first_part
            repeat with i from 1 to the decimal_adjust
                try
                    set the converted_number to ¬
                        the converted_number & character i of the second_part
                on error
                    set the converted_number to the converted_number & "0"
                end try
            end repeat
            return the converted_number
        else
            if this_number contains "E-" then
                set y to the offset of "-" in this_number
                if x is not 0 then
                    set the first_part to text 1 thru (x - 1) of this_number
                else
                    set the first_part to ""
                end if
                set the second_part to text (x + 1) thru (z - 1) of this_number
                set the converted_number to the first_part & second_part
                set n to text (y + 1) thru -1 of this_number as number
                set zero to "0."
                if n > 1 then
                    repeat (n - 1) times
                        set zero to zero & "0"
                    end repeat
                end if
                set converted_number to zero & converted_number
            else
                set converted_number to this_number
            end if
        end if

        return converted_number
    end number_to_string


As for why your AppleScript code didn't work:

Your code pays attention only to the exponent, and not to the number of digits in the mantissa (the fractional part of the number before the exponent).

Thus, with 1.23456789e+4 as input, strictly 4 digits of the mantissa are extracted to form the result, irrespective of how many digits the mantissa has: 1 & the first 4 digits of 23456789, which yields 12345.
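To see the truncation in isolation, here is a minimal shell sketch that mirrors the "E+" branch of number_to_string. The function names naive_convert and faithful_convert are hypothetical, and the sketch assumes input with a decimal point, an explicit + and an exponent of at least 1 (such as 1.23456789e+4).

```shell
# Hypothetical helper: mirrors the "E+" branch of number_to_string.
# Assumes input like 1.23456789e+4 (decimal point present, explicit "+", exponent >= 1).
naive_convert() {
  mantissa=${1%%[eE]*}          # e.g. 1.23456789
  exponent=${1##*[eE]+}         # e.g. 4
  first_part=${mantissa%%.*}    # digits before the decimal point
  second_part=${mantissa#*.}    # digits after the decimal point
  # Copy exactly $exponent digits of the fraction; pad with zeros if it is shorter.
  taken=$(printf '%s\n' "$second_part" | cut -c "1-$exponent")
  while [ "${#taken}" -lt "$exponent" ]; do taken="${taken}0"; done
  printf '%s\n' "${first_part}${taken}"   # the remaining fraction digits are dropped
}

# Hypothetical fix: same extraction, but the leftover digits follow a radix point.
faithful_convert() {
  mantissa=${1%%[eE]*}
  exponent=${1##*[eE]+}
  first_part=${mantissa%%.*}
  second_part=${mantissa#*.}
  taken=$(printf '%s\n' "$second_part" | cut -c "1-$exponent")
  while [ "${#taken}" -lt "$exponent" ]; do taken="${taken}0"; done
  rest=$(printf '%s\n' "$second_part" | cut -c "$((exponent + 1))-")
  printf '%s\n' "${first_part}${taken}${rest:+.}${rest}"
}

naive_convert 1.23456789e+4      # prints 12345
faithful_convert 1.23456789e+4   # prints 12345.6789
```

The only difference between the two functions is the last two lines of faithful_convert: the digits that the loop never reaches are appended after a decimal point instead of being discarded.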


Getting this right in AppleScript is nontrivial, so I suggest using do shell script with a shell command that uses bc, the POSIX arbitrary-precision calculation utility, to get there more quickly:

    set xx to "1.23456789e+4" # define as a *string* to avoid rounding errors

    # Perform the transformation via the handler defined below.
    set yy to todecfraction(xx)

    display alert yy

    on todecfraction(numstr)

        local maxdecplaces

        # For *negative* exponents: set the maximum number of decimal places in the result.
        # For *positive* exponents: the number of decimal places in the result
        # is automatically chosen to accommodate all digits.
        # In either case: the max. number of decimal places supported is 2,147,483,647.
        set maxdecplaces to 32

        do shell script "{ printf 'scale=" & maxdecplaces & ¬
            "; '; sed -E 's/[eE]\\+?/*10^/g' <<<" & quoted form of (numstr as text) & "; } |
              bc | tr -d '\\n\\' |
                sed -E -e '/\\./!b' -e 's/\\.0+$//;t' -e 's/0+$//; s/^(-?)(\\.)/\\10\\2/'"

    end todecfraction

  • printf 'scale=<n>;', when piped to bc, instructs it to use a precision of <n> decimal places in the case of a negative exponent; if the exponent is positive, bc automatically picks a precision that preserves all digits.
    • The upper limit for the number of decimal places is a hypothetical 2,147,483,647(!) (2^32/2-1), but note that the higher the number you choose for maxdecplaces (in the case of a negative exponent) or the more decimal places the input has (in the case of a positive exponent), the longer the conversion will take, though in practice there is little difference in performance between a limit of, say, 32 vs. 200(!) decimal places. Note that truncation, not rounding, occurs if the limit is too low.
    • It is possible to calculate the exact number of decimal places needed to preserve all digits, but that requires non-trivial lexical analysis, so choosing a high-enough upper bound is a pragmatic compromise.
  • sed -E 's/[eE]\+?/*10^/g' reformats the scientific notation into an equivalent arithmetic expression that bc can evaluate; e.g.:
    • 1e2 -> 1*10^2
    • .3e+1 -> .3*10^1
    • 2.5e-2 -> 2.5*10^-2
  • Passing the expression to bc prints the result as a decimal fraction with as many decimal places as implied by the input (in the case of a positive exponent), or as specified via variable scale (in the case of a negative exponent).
  • tr -d '\n\' is needed to remove the \ chars. and newlines that bc inserts when outputting numbers more than 70 characters long.
  • sed -E -e '/\\./!b' -e 's/\\.0+$//;t' -e 's/0+$//; s/^(-?)(\\.)/\\10\\2/' cleans up the result: it removes trailing zeros (and removes the decimal point, if no decimal places are left), and prepends 0 if the (absolute value of the) result is < 1.
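The two sed stages of the pipeline can be tried on their own, without bc, which makes it easier to see what each one contributes. The following is a minimal sketch; the function names sci_to_expr and clean_result are hypothetical wrappers, and the inputs passed to clean_result are merely examples of the kind of output bc produces.

```shell
# First stage of the pipeline: turn scientific notation into a bc expression.
sci_to_expr() {
  printf '%s\n' "$1" | sed -E 's/[eE]\+?/*10^/g'
}

# Final stage of the pipeline: tidy a bc-style result.
#   /\./!b          leave values without a decimal point untouched
#   s/\.0+$//;t     drop an all-zero fraction (and the point), then stop
#   s/0+$//         otherwise trim trailing zeros
#   s/^(-?)(\.)/..  prepend 0 when the value starts with "." or "-."
clean_result() {
  printf '%s\n' "$1" |
    sed -E -e '/\./!b' -e 's/\.0+$//;t' -e 's/0+$//; s/^(-?)(\.)/\10\2/'
}

sci_to_expr 2.5e-2            # prints 2.5*10^-2
clean_result 12345.67890000   # prints 12345.6789
clean_result .01000000        # prints 0.01
clean_result 1200.000         # prints 1200
```

Note that outside an AppleScript string literal the backslashes are written only once; inside the do shell script string each one has to be doubled.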

Note:

  • If the integer portion of the result is 0, it is printed, so, for instance, 1e-2 is printed as 0.01, as is normal in AppleScript - not as .01.
    • If you do not want the leading 0, replace -e 's/0+$//; s/^(-?)(\\.)/\\10\\2/' in the code above with -e 's/0+$//'.
  • bc is by design not locale-aware, so the radix character ("decimal point") it expects on input and produces on output is always .
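As a rough illustration of the earlier remark that the exact number of decimal places can be derived from the input, the sketch below counts the fractional digits of the mantissa and subtracts the exponent. The name needed_scale is hypothetical, and the sketch assumes well-formed input with no leading zeros in the exponent; validating arbitrary input is the non-trivial part and is not attempted here.

```shell
# Hypothetical helper: number of decimal places needed to keep every digit
# of a number given in scientific notation (assumes well-formed input).
needed_scale() {
  mantissa=${1%%[eE]*}
  # Exponent defaults to 0 when the input has no "e" part.
  case $1 in
    *[eE]*) exp=${1##*[eE]} ;;
    *)      exp=0 ;;
  esac
  exp=${exp#+}                  # drop an explicit plus sign
  # Fractional digits of the mantissa, if any.
  case $mantissa in
    *.*) frac=${mantissa#*.} ;;
    *)   frac='' ;;
  esac
  n=$(( ${#frac} - exp ))       # each power of ten shifts the point by one place
  [ "$n" -gt 0 ] || n=0
  printf '%s\n' "$n"
}

needed_scale 1.23456789e+4   # prints 4  (12345.6789)
needed_scale 2.5e-2          # prints 3  (0.025)
needed_scale 1e2             # prints 0  (100)
```

A value computed this way could be used in place of the fixed maxdecplaces, at the cost of the extra parsing.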

For comparison, here is a handler that uses pure bash code to perform the transformation lexically - as you can see, the effort of rolling one's own transformation is nontrivial - and it would be even more verbose in AppleScript.

In practice, the 2 approaches perform about the same. The advantage of this solution is that there's no limit on the number of decimal places, all digits are automatically preserved, and unrecognized number strings reliably raise an error.

    set xx to "1.23456789e+4" # define as a *string* to avoid rounding errors

    # Perform the transformation via the handler defined below.
    set yy to todecfraction(xx)

    display alert yy

    # SYNOPSIS
    #   todecfraction(numstring)
    # DESCRIPTION
    #   Textually reformats the specified number string from decimal exponential (scientific) notation
    #   (e.g., 1.234e+2) to a decimal fraction (e.g., 123.4).
    #   Leading and trailing whitespace is acceptable.
    #   Input that is in integer form or already a decimal fraction is accepted, and echoed *unmodified*.
    #   No fractional part is output if there is none; e.g., '1.2e1' results in '12'.
    #   Numbers with an integer part of 0 are output with a leading 0 (e.g. '0.1', not '.1').
    #   Unrecognized number strings result in an error.
    #   There is no limit on the number of decimal places and there are no rounding errors, given that
    #   the transformation is purely *lexical*.
    #   NOTE: This function is not locale-aware: '.' must be used as the radix character.
    # EXAMPLES
    #   todecfraction('1.234567e+2') # -> '123.4567'
    #   todecfraction('+1e-3') # -> '0.001'
    #   todecfraction('-1.23e+3') # -> '-1230'
    #   todecfraction('1e-1') # -> '0.1'
    on todecfraction(numstr)
        try
            do shell script "
    todecfraction() {
      local numstr leadingzero sign intpart fractpart expsign exponent alldigits intdigitcount intdigits fractdigits padcount result
      { [[ $1 == '--' ]] && shift; } || { [[ $1 == '-z' ]] && { leadingzero=1; shift; } }
      read -r numstr <<<\"$1\" # trim leading and trailing whitespace
      # Parse into constituent parts and fail, if not recognized as decimal integer / exponential notation.
      [[ $numstr =~ ^([+-]?)([[:digit:]]+)?\\.?(([[:digit:]]+)?([eE]([+-]?)([[:digit:]]+))?)?$ ]] || return 1
      sign=${BASH_REMATCH[1]} intpart=${BASH_REMATCH[2]}
      fractpart=${BASH_REMATCH[4]} expsign=${BASH_REMATCH[6]} exponent=${BASH_REMATCH[7]}
      # If there's neither an integer nor a fractional part, fail.
      [[ -n $intpart || -n $fractpart ]] || return 1
      # Debugging: echo \"[$sign][$intpart].[$fractpart]e[$expsign][$exponent]\"
      # If there's no exponent involved, output the number as is
      # (it is either an integer or a decimal fraction.)
      [[ -n $exponent ]] || { echo \"$1\"; return 0; }
      alldigits=${intpart}${fractpart}
      # Calculate the number of integer digits in the resulting decimal fraction,
      # after resolving the exponent.
      intdigitcount=$(( ${#intpart} + ${expsign}${exponent} ))
      # If the sign is an explicit +, set it to the empty string - we don't want to output it.
      [[ $sign == '+' ]] && sign=''
      if (( intdigitcount > 0 )); then # at least 1 integer digit
        intdigits=${alldigits:0:intdigitcount}
        padcount=$(( intdigitcount - ${#intdigits} ))
        (( padcount > 0 )) && intdigits=${intdigits}$(printf \"%${padcount}s\" | tr ' ' '0')
        fractdigits=${alldigits:intdigitcount} # determine what goes after the radix character
        result=${sign}${intdigits}${fractdigits:+.}${fractdigits}
        # Remove leading zeros, if any.
        [[ $result =~ ^0+([^0].*)?$ ]] && result=\"${BASH_REMATCH[1]}\"
      else # result < 1
        padcount=$(( -intdigitcount ))
        result=${sign}${leadingzero:+0}.$(printf \"%${padcount}s\" | tr ' ' '0')${intpart}${fractpart}
      fi
      # Trim an empty fractional part, and ensure that if
      # the result is empty, '0' is output.
      [[ $result =~ ^([^.]*)\\.0+$ ]] && result=\"${BASH_REMATCH[1]}\"
      printf '%s\\n' \"${result:-0}\"
    }
    todecfraction -z " & quoted form of (numstr as text)
        on error number errnum
            error "Not recognized as a number: " & (numstr as text) number (500 + errnum)
        end try
    end todecfraction

Here's the embedded bash function on its own, with proper syntax highlighting:

    todecfraction() {
      local numstr leadingzero sign intpart fractpart expsign exponent alldigits intdigitcount intdigits fractdigits padcount result
      { [[ $1 == '--' ]] && shift; } || { [[ $1 == '-z' ]] && { leadingzero=1; shift; } }
      read -r numstr <<<"$1" # trim leading and trailing whitespace
      # Parse into constituent parts and fail, if not recognized as decimal integer / exponential notation.
      [[ $numstr =~ ^([+-]?)([[:digit:]]+)?\.?(([[:digit:]]+)?([eE]([+-]?)([[:digit:]]+))?)?$ ]] || return 1
      sign=${BASH_REMATCH[1]} intpart=${BASH_REMATCH[2]}
      fractpart=${BASH_REMATCH[4]} expsign=${BASH_REMATCH[6]} exponent=${BASH_REMATCH[7]}
      # If there's neither an integer nor a fractional part, fail.
      [[ -n $intpart || -n $fractpart ]] || return 1
      # Debugging: echo "[$sign][$intpart].[$fractpart]e[$expsign][$exponent]"
      # If there's no exponent involved, output the number as is
      # (it is either an integer or a decimal fraction.)
      [[ -n $exponent ]] || { echo "$1"; return 0; }
      alldigits=${intpart}${fractpart}
      # Calculate the number of integer digits in the resulting decimal fraction,
      # after resolving the exponent.
      intdigitcount=$(( ${#intpart} + ${expsign}${exponent} ))
      # If the sign is an explicit +, set it to the empty string - we don't want to output it.
      [[ $sign == '+' ]] && sign=''
      if (( intdigitcount > 0 )); then # at least 1 integer digit
        intdigits=${alldigits:0:intdigitcount}
        padcount=$(( intdigitcount - ${#intdigits} ))
        (( padcount > 0 )) && intdigits=${intdigits}$(printf "%${padcount}s" | tr ' ' '0')
        fractdigits=${alldigits:intdigitcount} # determine what goes after the radix character
        result=${sign}${intdigits}${fractdigits:+.}${fractdigits}
        # Remove leading zeros, if any.
        [[ $result =~ ^0+([^0].*)?$ ]] && result="${BASH_REMATCH[1]}"
      else # result < 1
        padcount=$(( -intdigitcount ))
        result=${sign}${leadingzero:+0}.$(printf "%${padcount}s" | tr ' ' '0')${intpart}${fractpart}
      fi
      # Trim an empty fractional part, and ensure that if
      # the result is empty, '0' is output.
      [[ $result =~ ^([^.]*)\.0+$ ]] && result="${BASH_REMATCH[1]}"
      printf '%s\n' "${result:-0}"
    }

Finally, here is an even simpler shell command, which, however, is not recommended: because it is subject to the inherent rounding errors of double-precision floating-point values, you cannot guarantee that all digits are (faithfully) preserved:

    set xx to "1.23456789e+4"

    set yy to do shell script "awk -v n=" & quoted form of (xx as text) & " 'BEGIN \\
     { CONVFMT=\"%.11f\"; ns=\"\"(n + 0); if (ns ~ /\\./) gsub(\"0+$\",\"\",ns); print ns }'"

    display alert yy

The command uses awk's native ability to recognize scientific notation, and converts the resulting number back to a string using the (implicitly applied) printf number format "%.11f" - i.e., 11 decimal places; trailing zeros are trimmed (with gsub()) before the result is returned.

At first glance, this appears to work fine: the result is 12345.6789. However, if you change the number of decimal places to 12 (CONVFMT=\"%.12f\"), a rounding error creeps in: 12345.678900000001(!)

You won't know in advance when this happens, so if faithful preservation of all digits is required, this approach is not viable.
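The effect is easy to reproduce directly in a shell. The sketch below uses a hypothetical helper, fmt_places, that formats a number with a given number of decimal places via awk's double-precision arithmetic; it assumes an awk that uses IEEE 754 doubles, which is the usual case.

```shell
# Hypothetical helper: format number $1 with $2 decimal places using awk,
# which stores numbers as double-precision floating-point values.
fmt_places() {
  awk -v n="$1" -v p="$2" 'BEGIN { fmt = "%." p "f\n"; printf fmt, n + 0 }'
}

fmt_places 1.23456789e+4 11   # prints 12345.67890000000
fmt_places 1.23456789e+4 12   # prints 12345.678900000001
```

The stray 1 in the second result comes from the nearest double to 12345.6789, which is slightly larger than the decimal value; 11 places happen to hide the difference, 12 do not.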

