RegexKit

An Objective-C Framework for Regular Expressions using the PCRE Library for Mac OS X Cocoa and GNUstep

RegexKit Ranking & Summary

  • License: BSD
  • Price: FREE
  • Publisher Name: John Engelhart
  • Operating Systems: Mac OS X 10.5
  • File Size: 1.7 MB

RegexKit Description

RegexKit is an Objective-C framework for regular expressions:
· Support for Mac OS X Cocoa and GNUstep.
· No sub-classing required. Seamlessly adds regular expression support to all NSArray, NSDictionary, NSSet, and NSString Foundation objects with a rich set of Objective-C category additions.
· Unicode UTF-8 supported.
· Extensive, high quality documentation.
· Full source code with a BSD license.
· Uses the BSD licensed PCRE (Perl Compatible Regular Expressions) library for the regular expression engine.

Includes support for Mac OS X 10.5 Leopard:
· 64 bit support. Pre-built for ppc, ppc64, i386, and x86_64.
· Garbage Collection enabled. Complete support for Leopard's Garbage Collection feature.
· Integrated Xcode 3.0 documentation. Get real time API information via the Research Assistant.

Here are some key features of "RegexKit":
· Caches the compiled form of the regular expression for speed.
· Multithreading safe, including multiple reader, single writer multithreaded access to the compiled regex cache.
· Makes minimal use of heap storage (i.e., malloc() and free()), instead allocating most temporary buffer needs dynamically from the stack.
· Uses Core Foundation directly on Mac OS X for additional speed.

What's New in This Release:
· Changed RKPrettyObject from a preprocessor macro to a function. This shaves ~30K off the executable, or 4-7K per architecture.
· Added a PerformanceNote if pcre_study was able to optimize the regular expression.
· Added XCODE_CFLAGS_* and PCRE_CFLAGS_* to RegexKit Build Settings.xcconfig to add OS specific CFLAGS to Xcode and PCRE built sources.
· Using the new CFLAGS added above, added -fstack-protector-all to the Mac OS X 10.5 targets. -fstack-protector-all is a GCC feature that Apple backported from later releases of GCC that can catch many stack smashes.
· Modified projectExportedSymbols to export the new RKErrorKey, RKErrorDomain, and RKRegexPCRELibrary NSString constants.
· Added RegexKit private extensions to NSException and NSError to simplify creating localized versions of each for simple instantiations.
· Added several RegexKit private pre-processor macros to simplify creating localized strings. These are similar in nature to Foundation's NSLocalizedString family of macros.
· Converted RKConvertUTF8ToUTF16RangeForString and RKConvertUTF16ToUTF8RangeForString to StringBuffer equivalents so internal routines could call the more efficient StringBuffer functions. The plain string methods became wrappers around the StringBuffer functions.
· Created a RegexKit private function, RKLocalizedStringForPCRECompileErrorCode, that returns an error description string better suited to the NSError descriptions that end users see. The localized strings ultimately come from the framework's bundle resource file pcre.strings.
· Created a RegexKit private global variable, RKFrameworkBundle, that is created by RKRegex at load time and provides the means to access localized strings.
· Created a private function that creates an NSException from an NSError in the same way that the initWithRegexString:options: method previously did, since that method now creates and returns an NSError for most error conditions.
· Modified RKRegex isEqual: to directly access the comparison object's instance variables, for speed, if it is an RKRegex class object.
· Added a RegexKit private function to return the number of bytes for a UTF8 encoded character at a given pointer, and a function that returns the range for a UTF8 encoded character at a given offset of a pointer. If the offset points to the middle of a UTF8 encoded character, it will back up to the start of the UTF8 character that is at the given offset, then determine the number of bytes required to represent a single Unicode code point encoded in UTF8 format.
· Expanded the locking strategies that the framework private locking class provides. Previously the locking class only provided a blocking acquire strategy. Strategies now available include:
  · Try for reading.
  · Try for writing.
  · Try for writing, then try for reading.
  · Try for writing, then blocking acquire for reading.
  · Blocking acquire for reading.
  · Blocking acquire for writing.
  This change was made to permit certain caching functions to be non-blocking when a thread is unable to immediately acquire the requested degree of mutual exclusion for a shared resource.
· Added a new private locking class, RKConditionLock, that is similar to its Foundation counterpart. This was done to provide a faster function call interface and enhanced locking methodologies, along with NSTimeInterval based relative times instead of the much more expensive NSDate object based times. Using NSDate objects incurs a significant performance penalty because of the object creation and destruction overhead needed to ultimately convey a double value as an argument. Passing the relative time directly as a double argument bypasses that overhead completely. The two functions RKFastConditionLock and RKFastConditionUnlock provide all the functionality and can be called directly; the object oriented interface methods are just stubs for these two functions. These functions and objects are not exported and are framework private.
· Consolidated a lot of common logic for the locks into the two functions RKFastMutexLock and RKFastMutexUnlock. The RKLock class was moved to this common code base, but for the time being, RKReadWriteLock remains unaltered.
· The pthread mutexes created by RKLock and RKConditionLock are now created with the pthread mutex attribute PTHREAD_MUTEX_ERRORCHECK, which causes extra sanity checks to be performed, such as the same thread locking a locked mutex, unlocking an unlocked mutex, or a thread attempting to unlock a mutex that was locked by a different thread.
· Updated the license displayed in the installer to include the PCRE license explicitly.
· Added the PCRE license to the project root directory LICENSE file.
· Added the LICENSE file to the RegexKit Framework target's Copy Bundle Resources build phase so the license is present in the framework's Resources directory.
· Added RKAtomicBarrier macros / functions that perform full memory barrier semantics for architectures where this makes a difference.
· Changed the second argument of the BeginLock and EndLock dtrace probes from int to NSInteger to match the information now provided by RKLock and RKReadWriteLock. The locking strategy requested and the final lock level acquired are now reported instead of a simple boolean read / write indication.
· Added BeginLock, EndLock, and Unlock to RegexKit.usdt.
· Split the header file RegexKitPrivate.h into several files: RegexKitPrivateAtomic.h, RegexKitPrivateDTrace.h, RegexKitPrivateLocalization.h, RegexKitPrivateMemory.h, RegexKitPrivateThreads.h, and NSStringPrivate.h.
· Changed RKRegex so that all of the class initialization takes place in the initialize method. Previously, some initialization took place in the load method, which meant it was executed even if the class was ultimately not used. Also added guard checks at some function entry points, since calling a function would not trigger the initialize behavior.
· Updated generateHTML.pl to properly iterate over the groups in the Constants Table of Contents entry. Previously, this was manually updated for each new group.
· Updated Copyright for 2008.
· Added the ability to specify availability (introduced in, deprecated in, removed in version, etc.) to the documentation system with the file availability.sql. Updated the docset tools to use this information when creating the Tokens.xml file.
· Altered the structure of the various unit tests. A lot of code had, over time, been replicated in several files and then drifted apart. This was consolidated into RKTestCase.m, which creates a common base object that is a subclass of SenTestCase and that the RegexKit unit test objects inherit from.
· Removed the Mac OS X malloc stats functionality from the unit tests NSDate object. Also removed the NSHighResTimeInterval type and replaced it with NSTimeInterval, as both were of type double.
· Added sortedRegexCollection.m to hold tests related to the new multithreaded sorted regex collection functionality.

Bug fixes:
· In RegexKit.usdt, the PerformanceNote probe arguments for Severity and generalStartEnd were swapped. This was corrected.
· Fixed a typo in RegexKit_match_timing.instrument. Somehow, "%x" was changed to b, which made the instrument fail to parse and thus not show up in Instruments.app.
· Fixed some errors in some HTML files and the print.css style sheet that would cause some titles to be negatively offset past the printable border.
· Fixed a Firefox display bug in common.css that caused sourcecode boxes to not be formatted.
· The RKReadWriteLock class would harmlessly display an incorrect count of the number of spurious error attempts out of a maximum number of attempts.
· The RKReadWriteLock class would harmlessly increment an internal debugging counter twice if it was unable to acquire a write level lock on the first attempt.
· The RKReadWriteLock class would incorrectly update an internal ivar regarding the read or write condition of the lock regardless of whether or not a pthread error prevented the lock from being acquired.
· Changed the RKRegex retain and release methods to use the RKAtomicBarrier routines to enforce a full memory barrier. The previous lack of a barrier may have led to race conditions on architectures in which this makes a difference, such as the PowerPC architecture, when multiple CPUs are attempting to update the same memory location simultaneously.
· Changed the framework internal RKRegexFromStringOrRegex functions so that when an object is determined to be a member of the RKRegex class, but the options specified in the instantiated regex do not match the required options, the class of the instantiated regex is used to create a new RKRegex with the required options instead of using the base RKRegex class. This would make a difference only to a subclass of RKRegex that overrode the object creation process.
· SourceForge bug 1850418 - 'Error linking under 10.4'. This issue is covered in Release Information - Release Notes for 0.6.0 Beta. Update: Resubmitted this bug to Apple as bug # 5708443. The original bug report was closed as Behaves correctly. The justification given is "It is the same for linking - the 10.4 based linker errors out when it sees things it does not understand in the 10.5 libSystem.dylb. (sic)" and "In regards to the second post, there is no bug in the 10.5 linker. It is fine to link against the 10.5 libSystem.dylib with -macosx_version_min=10.4. The dtrace section is part of the implementation of libSystem.dylib. It is not part of the interface to the dylib (but the 10.4 linker does not know that)." To be honest, I'm sort of at a loss as to how the engineer managed to leap from the bug to the justifications given for closing it out as Behaves correctly.
· SourceForge bug 1878659 - 'Does not build on 10.5 systems building a 10.4 target'. Fixed the conditional of RK_REQUIRES_NIL_TERMINATION. The previous conditional incorrectly redefined it as NS_REQUIRES_NIL_TERMINATION when building on a 10.5 system, but strictly targeting 10.4. Added defined(NS_REQUIRES_NIL_TERMINATION) to further restrict the conditional. Also updated ENABLE_MACOSX_GARBAGE_COLLECTION to be further restricted with defined(__OBJC_GC__) and ENABLE_DTRACE_INSTRUMENTATION to be further restricted by defined(S_DTRACE_DOF), which is defined in mach-o/loader.h.
· Fixed a bug first reported by Doug Dickinson in the SourceForge RegexKit forum message 'match/replace with empty reference string?'. This turned out to be a bug in the NSString.m private function RKStringByMatchingAndExpanding. This function had an optimization: if no replacements took place, no changes had been made to the original, so it could return the original string unedited instead of creating a new one. Unfortunately, when the regex matched the beginning of the string to be searched but the replacement string was 'empty' (i.e., @""), it appeared as if no changes took place, since no replacement text was required. Fixed by also checking the final NSRange of the edited string against the original string to search. If a 'match at the start, but replace with nothing' happens now (i.e., as if the NSRange location had moved from 0 to a value > 0), these ranges will be different and the function will correctly return a string with the start cut off. The bug may also have affected similar search and replaces that took place on the tail end of a string, but the fix will catch that condition as well.
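
The 'match at the start, but replace with nothing' fix described in the last bug fix item can be illustrated with a small C sketch. This is not RegexKit's code: the function name replace_prefix, the Range struct, and the check_range flag are hypothetical, and a literal prefix comparison stands in for a real regex match. It only shows why counting inserted replacement text is not enough to decide that a string is unchanged, and how comparing the kept range against the original closes the gap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for NSRange. */
typedef struct { size_t location, length; } Range;

/*
 * Replaces a literal `prefix` at the start of `subject` with `replacement`,
 * writing the result to `out`. Returns true if the output differs from the
 * subject. With check_range == false, only inserted replacement text counts
 * as a change (the flawed shortcut). With check_range == true, the range of
 * the subject that survives is also compared against the whole subject.
 */
static bool replace_prefix(const char *subject, const char *prefix,
                           const char *replacement, bool check_range,
                           char *out, size_t out_size) {
    assert(subject && prefix && replacement && out && out_size > 0);

    size_t subject_len = strlen(subject);
    size_t prefix_len = strlen(prefix);
    Range kept = { 0, subject_len };   /* part of the subject that is kept */
    size_t inserted = 0;               /* bytes of replacement text used   */

    if (prefix_len > 0 && strncmp(subject, prefix, prefix_len) == 0) {
        kept.location = prefix_len;
        kept.length = subject_len - prefix_len;
        inserted = strlen(replacement);
    }

    /* Flawed test: an empty replacement looks like "nothing happened". */
    bool changed = inserted > 0;

    /* Fix: a kept range that differs from the full subject is a change. */
    if (check_range)
        changed = changed || kept.location != 0 || kept.length != subject_len;

    if (!changed) {
        snprintf(out, out_size, "%s", subject);
        return false;
    }
    snprintf(out, out_size, "%s%s", replacement, subject + kept.location);
    return true;
}
```

With subject "foobar", prefix "foo", and an empty replacement, the shortcut alone leaves the output as "foobar", while enabling the range check produces "bar".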

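The release notes above mention private helpers that report the byte length of a UTF8 encoded character and back up to the start of a character when an offset lands in the middle of one. A minimal C sketch of that idea follows, assuming well-formed UTF-8 input. The names utf8_char_length, utf8_char_range_at, and ByteRange are hypothetical and are not the framework's actual private functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte range type: start offset and length in bytes. */
typedef struct { size_t location, length; } ByteRange;

/*
 * Number of bytes in the UTF-8 sequence whose lead byte is at p.
 * Returns 0 if p points at a continuation byte or an invalid lead byte.
 */
static size_t utf8_char_length(const unsigned char *p) {
    assert(p != NULL);
    if (*p < 0x80) return 1;            /* 0xxxxxxx: ASCII          */
    if ((*p & 0xE0) == 0xC0) return 2;  /* 110xxxxx: 2-byte sequence */
    if ((*p & 0xF0) == 0xE0) return 3;  /* 1110xxxx: 3-byte sequence */
    if ((*p & 0xF8) == 0xF0) return 4;  /* 11110xxx: 4-byte sequence */
    return 0;
}

/*
 * Range of the UTF-8 character that covers s[offset]. If offset points into
 * the middle of a character, back up over continuation bytes (10xxxxxx)
 * until the lead byte is reached, then measure the character from there.
 */
static ByteRange utf8_char_range_at(const unsigned char *s, size_t offset) {
    assert(s != NULL);
    while (offset > 0 && (s[offset] & 0xC0) == 0x80)
        offset--;
    ByteRange r = { offset, utf8_char_length(s + offset) };
    return r;
}
```

For the byte sequence 'a', 0xC3, 0xA9, 0xE2, 0x82, 0xAC, 'z' (the text "a", "é", "€", "z"), an offset of 2 falls inside "é" and yields location 1 with length 2, and an offset of 5 falls inside "€" and yields location 3 with length 3.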

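The release notes above also say that the pthread mutexes are now created with the PTHREAD_MUTEX_ERRORCHECK attribute. As a rough illustration using only standard POSIX calls, the sketch below creates such a mutex. The helper name init_errorcheck_mutex is hypothetical, and the _XOPEN_SOURCE definition is an assumption made so the constant is visible under strict compiler modes.

```c
#define _XOPEN_SOURCE 700  /* assumption: exposes PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Initializes *m as an error-checking mutex. Returns 0 on success or the
 * pthread error code of the first call that failed.
 */
static int init_errorcheck_mutex(pthread_mutex_t *m) {
    assert(m != NULL);

    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

With this mutex type, POSIX specifies that relocking from the owning thread returns EDEADLK instead of deadlocking, and that unlocking a mutex the calling thread does not hold returns EPERM, which covers the sanity checks listed in the notes.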