comparison website/src/manual.html.luan @ 1716:b82767112d8e
add String.regex
| author | Franklin Schmidt <fschmidt@gmail.com> |
|---|---|
| date | Sun, 24 Jul 2022 23:43:03 -0600 |
| parents | 19df8abc9805 |
| children | c637a2a1023d |
| 1715:ad44e849c60c | 1716:b82767112d8e |
|---|---|
| 2413 to its corresponding argument. | 2413 to its corresponding argument. |
| 2414 </p> | 2414 </p> |
| 2415 <% | 2415 <% |
| 2416 end | 2416 end |
| 2417 } | 2417 } |
| 2418 ["String.contains"] = { | |
| 2419 title = "<code>String.contains (s, s2)</code>" | |
| 2420 content = function() | |
| 2421 %> | |
| 2422 <p> | |
| 2423 Returns a boolean indicating whether <code>s</code> contains <code>s2</code>. | |
| 2424 </p> | |
| 2425 <% | |
| 2426 end | |
| 2427 } | |
| 2418 ["String.encode"] = { | 2428 ["String.encode"] = { |
| 2419 title = "<code>String.encode (s)</code>" | 2429 title = "<code>String.encode (s)</code>" |
| 2420 content = function() | 2430 content = function() |
| 2421 %> | 2431 %> |
| 2422 <p> | 2432 <p> |
| 2423 Encodes argument <code>s</code> into a string that can be placed in quotes so as to return the original value of the string. | 2433 Encodes argument <code>s</code> into a string that can be placed in quotes so as to return the original value of the string. |
| 2434 </p> | |
| 2435 <% | |
| 2436 end | |
| 2437 } | |
| 2438 ["String.ends_with"] = { | |
| 2439 title = "<code>String.ends_with (s, s2)</code>" | |
| 2440 content = function() | |
| 2441 %> | |
| 2442 <p> | |
| 2443 Returns a boolean indicating whether <code>s</code> ends with <code>s2</code>. | |
| 2424 </p> | 2444 </p> |
| 2425 <% | 2445 <% |
| 2426 end | 2446 end |
| 2427 } | 2447 } |
| 2428 ["String.find"] = { | 2448 ["String.find"] = { |
| 2632 return String.match(s,pattern) ~= nil | 2652 return String.match(s,pattern) ~= nil |
| 2633 </pre> | 2653 </pre> |
| 2634 <% | 2654 <% |
| 2635 end | 2655 end |
| 2636 } | 2656 } |
| 2657 ["String.regex"] = { | |
| 2658 title = "<code>String.regex (s)</code>" | |
| 2659 content = function() | |
| 2660 %> | |
| 2661 <p> | |
| 2662 Returns a <a href="#regex_table">regex</a> table for the pattern <code>s</code>. | |
| 2663 </p> | |
| 2664 <% | |
| 2665 end | |
| 2666 } | |
| 2637 ["String.regex_quote"] = { | 2667 ["String.regex_quote"] = { |
| 2638 title = "<code>String.regex_quote (s)</code>" | 2668 title = "<code>String.regex_quote (s)</code>" |
| 2639 content = function() | 2669 content = function() |
| 2640 %> | 2670 %> |
| 2641 <p> | 2671 <p> |
| 2672 title = "<code>String.split (s, pattern [, limit])</code>" | 2702 title = "<code>String.split (s, pattern [, limit])</code>" |
| 2673 content = function() | 2703 content = function() |
| 2674 %> | 2704 %> |
| 2675 <p> | 2705 <p> |
| 2676 Splits <code>s</code> using regex <code>pattern</code> and returns the results. If <code>limit</code> is positive, at most that many results are returned. If <code>limit</code> is zero, trailing empty results are removed. | 2706 Splits <code>s</code> using regex <code>pattern</code> and returns the results. If <code>limit</code> is positive, at most that many results are returned. If <code>limit</code> is zero, trailing empty results are removed. |
| 2707 </p> | |
| 2708 <% | |
| 2709 end | |
| 2710 } | |
| 2711 ["String.starts_with"] = { | |
| 2712 title = "<code>String.starts_with (s, s2)</code>" | |
| 2713 content = function() | |
| 2714 %> | |
| 2715 <p> | |
| 2716 Returns a boolean indicating whether <code>s</code> starts with <code>s2</code>. | |
| 2677 </p> | 2717 </p> |
| 2678 <% | 2718 <% |
| 2679 end | 2719 end |
| 2680 } | 2720 } |
| 2681 ["String.sub"] = { | 2721 ["String.sub"] = { |
| 2778 Receives a string and returns a copy of this string with all | 2818 Receives a string and returns a copy of this string with all |
| 2779 lowercase letters changed to uppercase. | 2819 lowercase letters changed to uppercase. |
| 2780 All other characters are left unchanged. | 2820 All other characters are left unchanged. |
| 2781 The definition of what a lowercase letter is depends on the current locale. | 2821 The definition of what a lowercase letter is depends on the current locale. |
| 2782 </p> | 2822 </p> |
| 2823 <% | |
| 2824 end | |
| 2825 } | |
| 2826 } | |
| 2827 } | |
| 2828 regex_table = { | |
| 2829 title = "Regular Expressions" | |
| 2830 content = function() | |
| 2831 %> | |
| 2832 <p> | |
| 2833 Regular expressions are handled using a regex table generated by <a href="#String.regex">String.regex</a>. | |
| 2834 </p> | |
| 2835 | |
| 2836 <p> | |
| 2837 Pattern matching is based on the Java <a href="http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html">Pattern</a> class. | |
| 2838 </p> | |
| 2839 <% | |
| 2840 end | |
| 2841 subs = { | |
| 2842 ["regex.find"] = { | |
| 2843 title = "<code>regex.find (s [, init])</code>" | |
| 2844 content = function() | |
| 2845 %> | |
| 2846 <p> | |
| 2847 Looks for the first match of | |
| 2848 the regex in the string <code>s</code>. | |
| 2849 If it finds a match, then <code>find</code> returns the indices of <code>s</code> | |
| 2850 where this occurrence starts and ends; | |
| 2851 otherwise, it returns <b>nil</b>. | |
| 2852 The optional second argument <code>init</code>, a number, specifies | |
| 2853 where to start the search; | |
| 2854 its default value is 1, and it can be negative. | |
| 2855 </p> | |
| 2856 | |
| 2857 <p> | |
| 2858 If the regex has captures, | |
| 2859 then in a successful match | |
| 2860 the captured values are also returned, | |
| 2861 after the two indices. | |
| 2862 </p> | |
| 2863 <% | |
| 2864 end | |
| 2865 } | |
| 2866 ["regex.gmatch"] = { | |
| 2867 title = "<code>regex.gmatch (s)</code>" | |
| 2868 content = function() | |
| 2869 %> | |
| 2870 <p> | |
| 2871 Returns an iterator function that, | |
| 2872 each time it is called, | |
| 2873 returns the next captures from the regex | |
| 2874 over the string <code>s</code>. | |
| 2875 If the regex specifies no captures, | |
| 2876 then the whole match is produced in each call. | |
| 2877 </p> | |
| 2878 | |
| 2879 <p> | |
| 2880 As an example, the following loop | |
| 2881 will iterate over all the words from string <code>s</code>, | |
| 2882 printing one per line: | |
| 2883 </p> | |
| 2884 <pre> | |
| 2885 local r = String.regex[[\w+]] | |
| 2886 local s = "hello world from Lua" | |
| 2887 for w in r.gmatch(s) do | |
| 2888 print(w) | |
| 2889 end | |
| 2890 </pre> | |
| 2891 | |
| 2892 <p> | |
| 2893 The next example collects all pairs <code>key=value</code> from the | |
| 2894 given string into a table: | |
| 2895 </p> | |
| 2896 <pre> | |
| 2897 local t = {} | |
| 2898 local r = String.regex[[(\w+)=(\w+)]] | |
| 2899 local s = "from=world, to=Lua" | |
| 2900 for k, v in r.gmatch(s) do | |
| 2901 t[k] = v | |
| 2902 end | |
| 2903 </pre> | |
| 2904 | |
| 2905 <p> | |
| 2906 For this function, a caret '<code>^</code>' at the start of a pattern does not | |
| 2907 work as an anchor, as this would prevent the iteration. | |
| 2908 </p> | |
| 2909 <% | |
| 2910 end | |
| 2911 } | |
| 2912 ["regex.gsub"] = { | |
| 2913 title = "<code>regex.gsub (s, repl [, n])</code>" | |
| 2914 content = function() | |
| 2915 %> | |
| 2916 <p> | |
| 2917 Returns a copy of <code>s</code> | |
| 2918 in which all (or the first <code>n</code>, if given) | |
| 2919 occurrences of the regex have been | |
| 2920 replaced by a replacement string specified by <code>repl</code>, | |
| 2921 which can be a string, a table, or a function. | |
| 2922 <code>gsub</code> also returns, as its second value, | |
| 2923 the total number of matches that occurred. | |
| 2924 The name <code>gsub</code> comes from <em>Global SUBstitution</em>. | |
| 2925 </p> | |
| 2926 | |
| 2927 <p> | |
| 2928 If <code>repl</code> is a string, then its value is used for replacement. | |
| 2929 The character <code>\</code> works as an escape character. | |
| 2930 Any sequence in <code>repl</code> of the form <code>$<em>d</em></code>, | |
| 2931 with <em>d</em> between 1 and 9, | |
| 2932 stands for the value of the <em>d</em>-th captured substring. | |
| 2933 The sequence <code>$0</code> stands for the whole match. | |
| 2934 </p> | |
| 2935 | |
| 2936 <p> | |
| 2937 If <code>repl</code> is a table, then the table is queried for every match, | |
| 2938 using the first capture as the key. | |
| 2939 </p> | |
| 2940 | |
| 2941 <p> | |
| 2942 If <code>repl</code> is a function, then this function is called every time a | |
| 2943 match occurs, with all captured substrings passed as arguments, | |
| 2944 in order. | |
| 2945 </p> | |
| 2946 | |
| 2947 <p> | |
| 2948 In any case, | |
| 2949 if the regex specifies no captures, | |
| 2950 then it behaves as if the whole regex was inside a capture. | |
| 2951 </p> | |
| 2952 | |
| 2953 <p> | |
| 2954 If the value returned by the table query or by the function call | |
| 2955 is not <b>nil</b>, | |
| 2956 then it is used as the replacement string; | |
| 2957 otherwise, if it is <b>nil</b>, | |
| 2958 then there is no replacement | |
| 2959 (that is, the original match is kept in the string). | |
| 2960 </p> | |
| 2961 | |
| 2962 <p> | |
| 2963 Here are some examples: | |
| 2964 </p> | |
| 2965 <pre> | |
| 2966 local r = String.regex[[(\w+)]] | |
| 2967 local x = r.gsub("hello world", "$1 $1") | |
| 2968 --> x="hello hello world world" | |
| 2969 | |
| 2970 local r = String.regex[[(\w+)]] | |
| 2971 local x = r.gsub("hello world", "$0 $0", 1) | |
| 2972 --> x="hello hello world" | |
| 2973 | |
| 2974 local r = String.regex[[(\w+)\s*(\w+)]] | |
| 2975 local x = r.gsub("hello world from Luan", "$2 $1") | |
| 2976 --> x="world hello Luan from" | |
| 2977 | |
| 2978 local r = String.regex[[\$(.*?)\$]] | |
| 2979 local x = r.gsub("4+5 = $return 4+5$", function(s) | |
| 2980 return load(s)() | |
| 2981 end) | |
| 2982 --> x="4+5 = 9" | |
| 2983 | |
| 2984 local r = String.regex[[\$(\w+)]] | |
| 2985 local t = {name="lua", version="5.3"} | |
| 2986 local x = r.gsub("$name-$version.tar.gz", t) | |
| 2987 --> x="lua-5.3.tar.gz" | |
| 2988 </pre> | |
| 2989 <% | |
| 2990 end | |
| 2991 } | |
| 2992 ["regex.match"] = { | |
| 2993 title = "<code>regex.match (s [, init])</code>" | |
| 2994 content = function() | |
| 2995 %> | |
| 2996 <p> | |
| 2997 Looks for the first <em>match</em> of | |
| 2998 the regex in the string <code>s</code>. | |
| 2999 If it finds one, then <code>match</code> returns | |
| 3000 the captures from the regex; | |
| 3001 otherwise it returns <b>nil</b>. | |
| 3002 If the regex specifies no captures, | |
| 3003 then the whole match is returned. | |
| 3004 The optional second argument <code>init</code>, a number, specifies | |
| 3005 where to start the search; | |
| 3006 its default value is 1, and it can be negative. | |
| 3007 </p> | |
| 3008 <% | |
| 3009 end | |
| 3010 } | |
| 3011 ["regex.matches"] = { | |
| 3012 title = "<code>regex.matches (s)</code>" | |
| 3013 content = function() | |
| 3014 %> | |
| 3015 <p> | |
| 3016 Returns a boolean indicating whether the regex can be found in string <code>s</code>. | |
| 3017 This function is equivalent to | |
| 3018 </p> | |
| 3019 <pre> | |
| 3020 return regex.match(s) ~= nil | |
| 3021 </pre> | |
| 2783 <% | 3022 <% |
| 2784 end | 3023 end |
| 2785 } | 3024 } |
| 2786 } | 3025 } |
| 2787 } | 3026 } |
| 3372 max-width: 700px; | 3611 max-width: 700px; |
| 3373 } | 3612 } |
| 3374 p[keywords] span { | 3613 p[keywords] span { |
| 3375 display: inline-block; | 3614 display: inline-block; |
| 3376 width: 100px; | 3615 width: 100px; |
| 3616 } | |
| 3617 code { | |
| 3618 font-size: 16px; | |
| 3619 font-weight: bold; | |
| 3620 } | |
| 3621 div[toc] code { | |
| 3622 font-size: inherit; | |
| 3623 font-weight: inherit; | |
| 3377 } | 3624 } |
| 3378 </style> | 3625 </style> |
| 3379 </head> | 3626 </head> |
| 3380 <body> | 3627 <body> |
| 3381 <% docs_header() %> | 3628 <% docs_header() %> |
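
The added manual text says that pattern matching is based on the Java Pattern class, so the `regex.gsub` and `regex.gmatch` examples in this change can be cross-checked against plain `java.util.regex`. The sketch below is not Luan's implementation: the class `RegexSketch` and its methods `replaceAll`, `replaceFirst`, and `pairs` are hypothetical names, and it assumes (without confirmation from the source) that `$0`/`$1` replacement strings and repeated matching behave the same way as in Java's `Matcher`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; it only mirrors the manual's examples in plain Java.
public class RegexSketch {

    // Replace every match of regex in s; comparable to gsub called without n.
    static String replaceAll(String regex, String s, String repl) {
        return Pattern.compile(regex).matcher(s).replaceAll(repl);
    }

    // Replace only the first match; comparable to gsub called with n = 1.
    static String replaceFirst(String regex, String s, String repl) {
        return Pattern.compile(regex).matcher(s).replaceFirst(repl);
    }

    // Collect key=value pairs by iterating over successive matches,
    // comparable to the gmatch loop that fills a table.
    static Map<String, String> pairs(String s) {
        Map<String, String> t = new LinkedHashMap<>();
        Matcher m = Pattern.compile("(\\w+)=(\\w+)").matcher(s);
        while (m.find()) {
            t.put(m.group(1), m.group(2));
        }
        return t;
    }

    public static void main(String[] args) {
        // prints "hello hello world world"
        System.out.println(replaceAll("(\\w+)", "hello world", "$1 $1"));
        // prints "hello hello world"
        System.out.println(replaceFirst("(\\w+)", "hello world", "$0 $0"));
        // prints "world hello Luan from"
        System.out.println(replaceAll("(\\w+)\\s*(\\w+)", "hello world from Luan", "$2 $1"));
        // prints "{from=world, to=Lua}"
        System.out.println(pairs("from=world, to=Lua"));
    }
}
```

The first three results agree with the values shown in the `regex.gsub` examples. The table-valued and function-valued forms of `repl` have no direct one-call equivalent in `Matcher` and are left out of this sketch.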
