Background Information (not part of the report)

Useful Links:

LSB Specification:
    http://www.linuxbase.org/spec/
Single UNIX Specification V3, ISO/IEC 9945:2002:
    http://www.unix.org/version3/
    http://www.unix.org/single_unix_specification/
White Paper on API standards:
    http://www.opengroup.org/~ajosey/wp-apis.txt

---------------------------------------------------------------------------

-----------------------------------------------------------------
THIS IS A WORKING DRAFT SUBJECT TO CHANGE. THE FINAL DRAFT WILL
BE BASED ON LSB 1.9. THIS IS BASED ON AN EARLIER LSB DRAFT.
-----------------------------------------------------------------

Technical Report (informative) - DRAFT

Topic: Conflicts between ISO/IEC 9945 (POSIX) and the Linux
       Standard Base.
Status: Unapproved Draft in Progress
Author: Andrew Josey, The Open Group
Version 0.85
Date: 28 July 2003

Introduction:

/* Notice on Draft:
 * This is a draft based on the current LSB 1.8 draft in progress.
 * When LSB 1.9 is available this document will be reworked.
 */

1.1 Purpose

The purpose of this Type 3 Technical Report (informative) is to document
the areas of conflict between ISO/IEC 9945 (POSIX) and the Free Standards
Group's Linux Standard Base specification, so that the appropriate
technical committees can use it when considering harmonization between
the standards efforts.

ISO/IEC 9945 (POSIX) is an important standard in use throughout the world.
There is a significant investment in applications developed for the ISO
POSIX standard. With the emergence of a standardization initiative for
the Linux operating system, some areas of conflict have been identified
between the Linux Standard Base specification and the ISO POSIX
standards. There is an essential market requirement that the conflicts
be resolved so that an application can be written to conform to both
standards.
Hundreds of millions of dollars of applications are built upon these
standards. This report is intended as a starting point for resolving
this issue.

1.2 Scope

The JTC 1/SC 22 Linux Study Group meeting (May 2003) recommended future
action towards adopting Linux as a JTC1 standard, most likely by
adopting the Linux Standard Base (LSB) specification as a Publicly
Available Specification (PAS). The Free Standards Group is in the process
of applying for PAS submitter status to JTC 1. The Free Standards Group
intends to submit the LSB for PAS approval. The scope of this technical
report is to identify areas of conflict between the LSB 1.9 specification
and the ISO/IEC 9945 (POSIX) standard.

---------------------------------------------------------------
Please note that:
THIS IS A WORKING DRAFT SUBJECT TO CHANGE. THE FINAL DRAFT WILL
BE BASED ON LSB 1.9. THIS IS BASED ON AN EARLIER LSB DRAFT,
since at the time of writing the LSB 1.9 specification is not
available.
---------------------------------------------------------------

1.3 Intended Audience

This document is intended to be submitted to JTC1 as a Technical Report.
It is anticipated that JTC1 would distribute it to working groups, such
as the Austin Group and the Linux Standard Base, for which it is in
scope. It is also intended to be of interest to systems engineers,
technical managers and procurement officers.

1.4 Document Overview

This document is organized as follows: Section two provides a list of
differences that could be possible conflicts or extensions in the
System Interfaces.
Section three provides a list of differences that could be possible
conflicts or extensions in the Shell and Utilities.

1.5 Acknowledgements

Extracts of this document are quoted from the ISO/IEC 9945:2002 and
Linux Standard Base documents.

Linux is a registered trademark of Linus Torvalds.
LSB is a trademark of the Free Standards Group.
POSIX is a registered trademark of the IEEE.

2. System Interfaces

This section describes possible areas of conflict between the LSB and
ISO/IEC 9945 (POSIX) for the System Interfaces.

This description is based on the work in progress version of the LSB
specification (LSB Common 1.8.0307024). Note that the descriptions of
the known conflicts are taken from the LSB and have not been verified
by the Austin Group; they may therefore be subject to interpretation of
the standard. In some cases, the differences may be upward compatible
extensions. Where the LSB provides its own API manual page rather than
referencing ISO/IEC 9945, that is noted here, and it is possible that
further investigation might determine that there is no conflict.

2.1 Interface definitions

2.1.1 fcntl

LSB permits implementations to set O_LARGEFILE

According to ISO/IEC 9945, only an application sets fcntl flags such
as O_LARGEFILE. However, the LSB specification also allows
implementations to set O_LARGEFILE in a case in which the default
behavior matches the O_LARGEFILE behavior, for example when off_t is
64 bits. The impact is that applications calling fcntl with the F_GETFL
command may receive the O_LARGEFILE flag in addition to the flags
explicitly set by the application.
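
As a minimal sketch of how an application might cope with the fcntl
behavior described above, the C fragment below tests individual bits in
the value returned by F_GETFL instead of comparing the whole flag word,
so an extra O_LARGEFILE bit set by the implementation is ignored. The
helper names fd_has_flags and fd_access_mode are hypothetical; they are
not taken from either specification.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical helper: true if every bit in 'wanted' is set in the
 * file status flags of fd. Bits the implementation may add on its
 * own, such as O_LARGEFILE, do not affect the result. */
bool fd_has_flags(int fd, int wanted)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return false;               /* fcntl failed, e.g. EBADF */
    return (flags & wanted) == wanted;
}

/* Hypothetical helper: return the access mode of fd (O_RDONLY,
 * O_WRONLY or O_RDWR), or -1 if fcntl fails. */
int fd_access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return flags & O_ACCMODE;       /* mask off all non-access bits */
}
```

A whole-word comparison such as flags == (O_WRONLY | O_APPEND) could
fail on an implementation that adds O_LARGEFILE, whereas
fd_has_flags(fd, O_APPEND) would not be affected.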

2.1.2 gethostbyname

The LSB has its own definition of gethostbyname() and does not
reference ISO/IEC 9945.

2.1.3 getopt

The LSB documents a number of GNU extensions to getopt().
It also references PASC Interpretation 1003.2 #150, which has been
incorporated into ISO/IEC 9945 and is thus no longer relevant.

2.1.4 gets

The LSB has deprecated the gets() function, whereas it is a first
class function in ISO/IEC 9945 and ISO/IEC 9899.

2.1.5 getservbyname

The LSB has its own definition of getservbyname() and does not
reference ISO/IEC 9945.

2.1.6 getservent

The LSB has its own definition of getservent() and does not
reference ISO/IEC 9945.

2.1.7 ioctl

The LSB has its own definition of ioctl() and does not
reference ISO/IEC 9945. In addition to a general ioctl() interface,
it includes the definition of a socket ioctl() interface.

2.1.8 iswctype

The LSB has its own definition of iswctype() and does not
reference ISO/IEC 9945.

2.1.9 kill

Process ID -1 doesn't affect calling process

If pid is specified as -1, LSB says that sig shall not be sent to the
calling process, whereas ISO/IEC 9945 states "If pid is -1, sig shall be
sent to all processes (excluding an unspecified set of system processes)
for which the process has permission to send that signal.".

This was a deliberate decision by Linus Torvalds after an unpopular
experiment in which the 2.5.1 kernel included the calling process. See
"What does it mean to signal everybody?", Linux Weekly News,
20 December 2001, http://lwn.net/2001/1220/kernel.php3

2.1.10 nice

As deprecated behavior, the LSB permits the return value of a successful
call to nice() to be 0 (rather than the new nice value). A future version
of the LSB is expected to require the new nice value, as specified in
ISO/IEC 9945.
Until then, applications need to call the getpriority() function
rather than rely on the return value from nice() on LSB systems.

2.1.11 opterr, optind, optopt

The LSB has its own definition of opterr, optind and optopt and does not
reference ISO/IEC 9945.

2.1.12 strptime

The LSB documents an issue with limiting the number of leading zeroes.
This may be a conflict and needs further investigation
as to the interpretation of ISO/IEC 9945.

LSB states:

"Number of leading zeroes limited

The Single UNIX Specification, Version 2 specifies fields for which
"leading zeros are permitted but not required"; however, applications must
not expect to be able to supply more leading zeroes for these fields than
would be implied by the range of the field. Implementations may choose
to either match an input with excess leading zeroes, or treat this as
a non-matching input. For example, %j has a range of 001 to 366, so 0,
00, 000, 001, and 045 are acceptable inputs, but inputs such as 0000,
0366 and the like are not.

Rationale

glibc developers consider it appropriate behavior to forbid excess
leading zeroes. When trying to parse a given input against several format
strings, forbidding excess leading zeroes could be helpful. For example,
if one matches 0011-12-26 against %m-%d-%Y and then against %Y-%m-%d,
it seems useful for the first match to fail, as it would be perverse to
parse that date as November 12, year 26. The second pattern parses it
as December 26, year 11.

The Single UNIX Specification is not explicit that an unlimited
number of leading zeroes are required, although it may imply this.
The LSB explicitly allows implementations to have either behavior.
Future versions of this standard may require implementations to forbid
excess leading zeroes."
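
One way an application could avoid depending on either strptime
behavior is to reject over-long fields itself before calling strptime.
The C fragment below is a sketch under that assumption for the %j
(day of year) field only. The helper name parse_day_of_year is
hypothetical, and the three-digit limit is derived from the 001 to 366
range quoted above.

```c
#define _XOPEN_SOURCE 700   /* needed for the strptime() declaration */
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: parse a day-of-year string with %j. Input
 * longer than three digits (excess leading zeroes, e.g. "0366") is
 * rejected up front, so the result does not depend on whether the
 * implementation accepts or rejects such input. */
bool parse_day_of_year(const char *s, struct tm *out)
{
    size_t len = strlen(s);
    if (len == 0 || len > 3)
        return false;
    for (size_t i = 0; i < len; i++)
        if (!isdigit((unsigned char)s[i]))
            return false;

    memset(out, 0, sizeof *out);
    const char *end = strptime(s, "%j", out);
    /* Succeed only if strptime consumed the entire string. */
    return end != NULL && *end == '\0';
}
```

On success, out->tm_yday holds the zero-based day of the year. The same
length check could be applied per field when parsing composite formats
such as %Y-%m-%d.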

2.1.13 strtok_r

The LSB has its own definition of strtok_r() and does not
reference ISO/IEC 9945.

2.1.14 system

The LSB has its own definition of system() and does not
reference ISO/IEC 9945.

2.1.15 unlink

May return EISDIR on directories

The LSB states that if path specifies a directory, a return
of EISDIR is permitted instead of EPERM as required by ISO/IEC 9945.
LSB notes that "The Linux kernel has deliberately chosen EISDIR for this
case and does not expect to change (Al Viro, personal communication)."

2.1.16 waitid

The LSB has deprecated the waitid() function, whereas it is a first
class function in ISO/IEC 9945 (but in the XSI option group).

2.1.17 waitpid

The LSB does not require implementations to support the
WCONTINUED or WIFCONTINUED functionality within waitpid().

2.2 Pthreads Behavior

LSB permits implementations to partially support the ISO/IEC 9945
pthreads definitions. It is noted that this may change in a future
revision of the LSB.

LSB states the current exceptions as follows:

"POSIX specifies a concept of per-process rather than per-thread
signals. The LSB does not require this behavior; traditional Linux
implementations have had per-thread signals only. A related issue is
that applications cannot rely on getpid() returning the same value in
different threads.

Note: one implication of per-thread signals is that a core dump (for
example) may not stop all threads in a given process. This may be an
issue when designing ways to stop/start applications.

Applications which create child processes (using fork() and the like)
must then wait for them (using waitpid() family of functions) in the same
thread as they created them. Note that coding applications this way will
work both with full POSIX threads and legacy Linux thread implementations.

POSIX specifies that changing the user or group id instantly affects the
behavior of all threads. This behavior is not specified; applications
must use their own lock if they need this behavior. Rationale: it seems
unnecessary and it is a performance hit (an SMP kernel must lock the
user id).

Although this standard doesn't have a way to list processes (/proc or "ps"
command line isn't in, right?), it is our intention to not specify one
way or the other whether multiple threads appear as separate processes
or as a single process.

Applications cannot rely on resource limits (getrusage and setrusage)
being maintained per-process rather than per-thread.

Applications must disconnect from the controlling tty before calling
pthread_create.

times() need not account for all threads; it may just account for
the caller.

Applications must not call pthread_cancel if they call any system
libraries (most notably X Window System libraries), as system libraries
are not guaranteed to be thread safe. Likewise, for such libraries,
only one thread per process may call them.

Applications cannot rely on fcntl/lockf locks being visible per-process
rather than per-thread. Likewise for mandatory file locks.

Threaded applications cannot use SIGUSR1 or SIGUSR2."

3. Shell and Utilities Interfaces

This section describes possible areas of conflict between the LSB and
ISO/IEC 9945 (POSIX) for the Shell and Utilities.

This description is based on the work in progress version of the LSB
specification (LSB Common 1.8.0307024). Note that the descriptions of
the known conflicts are taken from the LSB and have not been verified
by the Austin Group, thus they may be subject to interpretation of the
standard. Deprecated differences are not listed since they are assumed
to be removed at some future point.

In some cases, the differences may be upward compatible extensions.
Where the LSB provides its own API manual page rather than referencing
ISO/IEC 9945, that is noted here, and it is possible that further
investigation might determine that there is no conflict.

3.1 Utility definitions

3.1.1 ar

The LSB lists the following differences:

-T, -C
    need not be accepted.
-l
    has unspecified behavior.
-q
    has unspecified behavior; using -r is suggested.

3.1.2 at

The LSB lists the following differences:

-d
    is functionally equivalent to the -r option specified in
    ISO/IEC 9945.
-r
    need not be supported on LSB implementations, but the -d option
    is equivalent.
-t time
    need not be supported.

The files at.allow and at.deny reside in /etc rather than /usr/lib/cron
on LSB implementations.

3.1.3 awk

The LSB lists the following differences:

Certain aspects of internationalized regular expressions are optional.

3.1.4 batch

The LSB lists the following differences:

The files at.allow and at.deny reside in /etc rather than /usr/lib/cron
on LSB implementations.

3.1.5 bc

In order to obtain ISO/IEC 9945 conforming behavior, applications
are required to use the -s or --standard option to bc.

3.1.6 chown

The following is listed by the LSB as a difference but it is
probably an extension.

"The use of the '.' character as a separator between the specification of
the user name and group name is supported (in addition to the use of the
':' character as specified in the Single UNIX Specification)."

3.1.7 cpio

The LSB lists the following differences:

Certain aspects of internationalized filename globbing are optional.

3.1.8 crontab

The LSB lists the following differences:

The files at.allow and at.deny reside in /etc rather than /usr/lib/cron
on LSB implementations.

3.1.9 cut

The LSB lists the following difference:

-n
    has unspecified behavior.

3.1.10 df

The LSB lists the following differences:

If the -k option is not specified, disk space is shown in unspecified
units. Applications should specify -k.

If an argument is the absolute file name of a disk device node containing
a mounted filesystem, df shows the space available on that filesystem
rather than on the filesystem containing the device node (which is always
the root filesystem).

3.1.11 du

The LSB lists the following differences:

If the -k option is not specified, disk space is shown in unspecified
units. Applications should specify -k.

3.1.12 echo

Unlike the behavior specified in ISO/IEC 9945, LSB states that support for
options is implementation defined, and that the behavior of echo if any
arguments contain backslashes is also implementation defined. Applications
are advised not to run echo with a first argument starting with a hyphen,
or with any arguments containing backslashes; they must use printf in
those cases.

3.1.13 find

The LSB lists the following differences:

Certain aspects of internationalized filename globbing are optional.

3.1.14 fuser

The LSB lists the following differences:

-c
    has unspecified behavior.
-f
    has unspecified behavior.

3.1.15 grep

The LSB lists the following differences:

Certain aspects of internationalized regular expressions are optional.

3.1.16 ipcrm

The LSB has its own definition of ipcrm and does not
reference ISO/IEC 9945.

3.1.16 ipcs

The LSB has its own definition of ipcs and does not
reference ISO/IEC 9945.

3.1.17 ls

The LSB lists the following differences:

-p
    in addition to the behavior of printing a slash for a directory,
    ls -p may display other characters for other file types.
    Since ISO/IEC 9945 only defines the behavior for directories,
    this appears to be an upward compatible extension.

The LSB states that certain aspects of internationalized filename globbing
are optional.

3.1.18 m4

The LSB lists these as differences:

-P
    forces a m4_ prefix to all builtins.
-I directory
    Add directory to the end of the search path for includes.

These appear to be upward compatible extensions.

3.1.19 more

The LSB lists these as differences:

The more command need not respect the LINES and COLUMNS environment
variables.

The more command need not support the following interactive commands:

    g
    G
    u
    control u
    control f
    newline
    j
    k
    r
    R
    m
    ' (return to mark)
    /!
    ?
    N
    :e
    :t
    control g
    ZZ

-num
    specifies an integer which is the screen size (in lines).
-e
    has unspecified behavior.
-i
    has unspecified behavior.
-n
    has unspecified behavior.
-p
    Either (1) clear the whole screen and then display the text (instead
    of the usual scrolling behavior), or (2) provide the behavior
    specified by ISO/IEC 9945. In the latter case, the syntax is
    "-p command".
-t
    has unspecified behavior.

3.1.20 newgrp

The LSB has its own definition of newgrp and does not
reference ISO/IEC 9945.

3.1.21 od

The LSB lists these as differences:

-w, --width[=BYTES]
    outputs BYTES bytes per output line.
--traditional
    accepts arguments in pre-POSIX form described below.

Pre-POSIX Specifications

The LSB supports option intermixtures with the following pre-POSIX
specifications:

-a
    is equivalent to -t a, selects named characters.
-f
    is equivalent to -t fF, selects floats.
-h
    is equivalent to -t x2, selects hexadecimal shorts.
-i
    is equivalent to -t d2, selects decimal shorts.
-l
    is equivalent to -t d4, selects decimal longs.

3.1.22 patch

The LSB lists the following differences:

--binary
    reads and writes all files in binary mode, except for standard output
    and /dev/tty. This option has no effect on POSIX-compliant systems.
-u, --unified
    interprets the patch file as a unified context diff.

3.1.23 renice

The LSB lists the following differences:

-n increment
    has unspecified behavior.

3.1.24 sed

The LSB lists the following differences:

Certain aspects of internationalized regular expressions are optional.

3.1.25 split

The LSB lists the following differences:

-a suffix_length
    has unspecified behavior but is expected to align with ISO/IEC 9945
    in the future.

3.1.26 uname

The LSB lists the following differences:

-a
    prints all information (not just the options specified in
    ISO/IEC 9945).

3.1.27 wc

The LSB lists the following differences:

-m
    has unspecified behavior. The LSB will require support for this
    as specified in ISO/IEC 9945 in a future revision.

3.1.28 xargs

The LSB lists the following differences:

-E
    has unspecified behavior.
-I
    has unspecified behavior.
-L
    has unspecified behavior.

3.2 Internationalization

The LSB makes certain internationalization aspects optional.

3.2.1 Regular Expressions

Utilities that process regular expressions shall support Basic Regular
Expressions and Extended Regular Expressions as specified in ISO/IEC 9945
with the following exceptions:

Range expression (such as [a-z]) can be based on code point order instead
of collating element order.

Equivalence class expression (such as [=a=]) and multi-character collating
element expression (such as [.ch.]) are optional.

Handling of a multi-character collating element is optional.

This affects at least the following utilities: grep (including egrep),
sed, and awk.

3.2.2 Filename Globbing

Utilities that perform filename globbing (also known as Pattern Matching
Notation) shall do it as specified in ISO/IEC 9945 with the following
exceptions:

Range expression (such as [a-z]) can be based on code point order instead
of collating element order.

Equivalence class expression (such as [=a=]) and multi-character collating
element expression (such as [.ch.]) are optional.

Handling of a multi-character collating element is optional.

3.3 Shell Exceptions

The LSB documents the following exceptions for the shell (sh utility)
from ISO/IEC 9945.

3.3.1 Pathname of $0

When the shell searches for a command name in the PATH and finds a shell
script, ISO/IEC 9945 specifies that it shall pass the command name as
argv[0] and in the child shell script, $0 shall be set from argv[0].
(Note that there is a defect report pending on this issue.)

However, for an LSB shell, the system may implement either this behavior
or $0 may be set to an absolute pathname of the shell script.

3.3.2 Sourcing non-executable files

When PATH is used to locate a file for the dot utility, and a matching
file is on the PATH but is not readable, the behavior is undefined
(unlike ISO/IEC 9945, which the LSB states requires the shell to continue
searching through the rest of the PATH; see the "dot" man page under
"Special built in utilities").

3.3.3 Globalized Pattern Matching

For filename globbing, globalized implementations shall provide the
functionality defined in ISO/IEC 9945, with the following exceptions:

Range expression (such as [a-z]) can be based on code point order instead
of collating element order.

Equivalence class expression (such as [=a=]) and multi-character collating
element expression (such as [.ch.]) are optional.

Handling of a multi-character collating element is optional.

-----
Andrew Josey                            The Open Group
Director, Server Platforms              Apex Plaza, Forbury Road,
                                        Reading, Berks. RG1 1AX, England
Tel: +44 118 9508311 ext 2250           Fax: +44 118 9500110

UNIX is a registered trademark of The Open Group in the US
and other countries.