Wednesday, January 25, 2006

 

/sbin/ifconfig -a | /bin/grep HWaddr
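
If only the interface names and hardware addresses are wanted, the grep output can be trimmed further. The following is a minimal sketch; the function name list_macs and the sample address are made up for illustration, and it assumes the older ifconfig output format that prints "HWaddr".

```shell
# Print "interface MAC" for every line of ifconfig output that carries HWaddr.
list_macs() {
  awk '/HWaddr/ { print $1, $NF }'
}

# Sample ifconfig output (made-up address) so the sketch runs anywhere.
list_macs <<'EOF'
eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:192.168.1.10  Bcast:192.168.1.255  Mask:255.255.255.0
lo        Link encap:Local Loopback
EOF
```

On a live system the same function would be fed with: /sbin/ifconfig -a | list_macs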


Sunday, January 15, 2006

 

Steps for creating a custom hotkey to launch any application in GNOME

1. Open "gconf-editor" as the user you are logged in as in GNOME
2. Go to "apps", "metacity", "keybinding_commands"
3. Double click on e.g. "command_1"
4. Type in the name of the application you want to launch, e.g.
"gcalctool", the GNOME calculator
5. Go to "apps", "metacity", "global_keybindings"
6. Double click on e.g. "run_command_1"
7. Type in the key combination, e.g. <Control><Alt>c
8. Note the 'less than' and 'greater than' signs around the special function keys
9. DONE! Close gconf-editor and press CTRL-ALT-c; the calculator
should come up
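
The same two keys can probably be set from a terminal with gconftool-2 instead of clicking through gconf-editor. The sketch below only prints the commands so they can be reviewed first; the function name hotkey_commands is made up for illustration, and the key paths are assumed to match the "apps", "metacity" folders named in the steps above.

```shell
# Print the gconftool-2 commands that bind a key combination to a Metacity
# command slot. Arguments: slot number, application, key combination.
hotkey_commands() {
  slot=$1 app=$2 keys=$3
  printf "gconftool-2 --type string --set /apps/metacity/keybinding_commands/command_%s '%s'\n" "$slot" "$app"
  printf "gconftool-2 --type string --set /apps/metacity/global_keybindings/run_command_%s '%s'\n" "$slot" "$keys"
}

hotkey_commands 1 gcalctool '<Control><Alt>c'
```

To apply the settings, pipe the output to sh as the logged-in GNOME user.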


Thursday, January 05, 2006

 

white box black box

white box
Also known as glass box, structural, clear box and open box testing. A software testing technique whereby explicit knowledge of the internal workings of the item being tested is used to select the test data. Unlike black box testing, white box testing uses specific knowledge of the programming code to examine outputs. The test is accurate only if the tester knows what the program is supposed to do; he or she can then see whether the program diverges from its intended goal. White box testing does not account for errors caused by omission, and all visible code must also be readable.

For a complete software examination, both white box and black box tests are required.

black box
Also known as functional testing. A software testing technique whereby the internal workings of the item being tested are not known by the tester. For example, in a black box test on a software design the tester only knows the inputs and what the expected outcomes should be and not how the program arrives at those outputs. The tester does not ever examine the programming code and does not need any further knowledge of the program other than its specifications.

The advantages of this type of testing include:

The test is unbiased because the designer and the tester are independent of each other.
The tester does not need knowledge of any specific programming languages.
The test is done from the point of view of the user, not the designer.
Test cases can be designed as soon as the specifications are complete.

The disadvantages of this type of testing include:

The test can be redundant if the software designer has already run a test case.
The test cases are difficult to design.
Testing every possible input stream is unrealistic because it would take an inordinate amount of time; therefore, many program paths will go untested.


Monday, January 02, 2006

 

Reference Point linux performance tuning

Reference Point - linux performance and tuning refresher

File system

inode = data structure that is responsible for storing file-related information
ls -i

User ownership
Group ownership
Access mode (read, write, and execute permissions)
File type
Timestamp
File size
Pointers to data blocks

hard link: shares the same inode as the original; creates another name (directory entry) for the same file, not a copy of it
!!KJR - think of it as two pointers (names) to one data structure !!
Hard links can only be used within a single file system, i.e. you cannot link across partitions.

symbolic (soft) link: has its own inode and stores the path of the target, so it ends up pointing to the same data
ln -s
A symbolic link is shown by ls -al with a leading l and -> followed by the target
The inode number of the original is different from the inode number of the soft link
When you create a soft link, a new file is created

Advantages of soft link:
Symbolic links can be created across file systems.
Symbolic links can point to any type of file.
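Note that symbolic links do not occupy less disk space than hard links: a symbolic link needs its own inode, whereas a hard link only adds a directory entry.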
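
The inode behaviour of the two link types can be checked by hand. This is a small sketch that works in a scratch directory; the file names and the helper inode_of are made up for illustration.

```shell
# Work in a scratch directory so nothing outside it is touched.
demo=$(mktemp -d)
echo "some data" > "$demo/original"

ln "$demo/original" "$demo/hard"       # hard link: second name, same inode
ln -s "$demo/original" "$demo/soft"    # symbolic link: new file, own inode

# Print only the inode number of a path, without following symbolic links.
inode_of() {
  ls -di "$1" | awk '{ print $1 }'
}

echo "original: $(inode_of "$demo/original")"
echo "hard:     $(inode_of "$demo/hard")"
echo "soft:     $(inode_of "$demo/soft")"
```

The first two numbers should match and the third should differ. Remove the scratch directory afterwards with rm -rf "$demo".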

.....................................................................................................

cat /etc/fstab

LABEL=/ / ext3 defaults 1 1
none /dev/pts devpts gid=5,mode=620 0 0
none /proc proc defaults 0 0
none /dev/shm tmpfs defaults 0 0
/dev/hda2 swap swap defaults 0 0
/dev/cdrom /mnt/cdrom iso9660 noauto,owner,kudzu,ro 0 0
/dev/fd0 /mnt/floppy auto noauto,owner,kudzu 0 0

fs_spec: This field describes the file system to be mounted. The file system can be a block special device or a remote file system.

fs_file: This field specifies the mount point. The mount point is the directory on which the file system can be mounted.

fs_vfstype: This field is used to specify the type of file system to be mounted.

fs_mntops: This field is used to specify the mount options that must be used to mount the particular file system.

fs_freq: This field is used to specify whether the specified file system should be backed up. The value 1 signifies that the file system can be backed up when the dump backup utility is used. The value 0 signifies that the particular file system will not be backed up when the dump utility is used.

fs_passno: This field specifies the order in which the file systems are checked for integrity when the system boots. The root file system should always have the value 1, so that it is checked first. Other file systems should have an fs_passno value of 2, so that they are checked only after the root file system. A value of 0 means the file system is not checked.
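
To see how the six fields line up in practice, the check order can be pulled out of an fstab with awk. This is a rough sketch: the function name fsck_order is made up, and the /dev/hda3 /home line in the sample is a hypothetical entry added only to show a pass number of 2.

```shell
# Read fstab-format text on stdin and print "fs_passno mount-point" for every
# file system that is checked at boot, in checking order.
fsck_order() {
  awk '$1 !~ /^#/ && NF >= 6 && $6 > 0 { print $6, $2 }' | sort -n
}

fsck_order <<'EOF'
LABEL=/ / ext3 defaults 1 1
none /proc proc defaults 0 0
/dev/hda2 swap swap defaults 0 0
/dev/hda3 /home ext3 defaults 1 2
EOF
```

Against a real system the input would be /etc/fstab: fsck_order < /etc/fstab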

.....................................................................................................

virtual file system (VFS) - The kernel layer that presents one common interface on top of each lower-level file system

Tmpfs
# mount tmpfs /mnt/tmpfs -t tmpfs
It has a dynamic file system size. The tmpfs file system driver allocates more virtual memory while files are copied or created. This helps increase the file system capacity dynamically.
When you remove files from /mnt/tmpfs, the tmpfs file system driver frees virtual memory resources by reducing the file system size. As a result, virtual memory is made available to other parts of the system.
Speed is another great advantage of tmpfs. The tmpfs file system normally resides completely in RAM. As a result, read and write operations are performed faster. Even if some swap is used, performance remains high, and the swapped-out parts of the tmpfs file system are moved back to RAM when more free VM resources become available.
Virtual memory is volatile in nature. As a result, tmpfs data is not preserved between startups. This makes tmpfs ideal for data that does not need to be kept, such as temporary files, and unsuitable for storing critical data.

.....................................................................................................
Backup and Restore Tools
Command-Line: tar, cpio, dump, and restore
Tape Archiver (tar) based on ar

example. cd /usr/local
find . -print | cpio -pdmuv /destination/dir
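
The same directory copy can be done with tar, the first tool in the list above. This is only a sketch; the function name copy_tree is made up for illustration.

```shell
# Copy the tree under the first directory into the second, keeping permissions.
# One tar writes the archive to stdout, a second tar unpacks it from stdin.
copy_tree() {
  src=$1 dest=$2
  mkdir -p "$dest"
  (cd "$src" && tar cf - .) | (cd "$dest" && tar xpf -)
}

# Small self-contained demonstration in scratch directories.
from=$(mktemp -d)
to=$(mktemp -d)
mkdir "$from/sub"
echo "hello" > "$from/sub/file"
copy_tree "$from" "$to/copy"
ls -R "$to/copy"
```

For the cpio example above, the equivalent call would be copy_tree /usr/local /destination/dir.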

.....................................................................................................

compile new kernel

tar xvfz linux-2.6.14.tar.gz

cd /usr/src/linux-2.6.14
make mrproper
cp ../kernels/2.6.11-1.1369_FC4-i686/.config .
make oldconfig
make menuconfig   # or: make xconfig, or: make config
make dep          # 2.4 kernels only; no longer needed for 2.6 kernels
make clean
make bzImage
make modules
make modules_install
mkinitrd /boot/initrd-2.6.14.img 2.6.14
cp arch/i386/boot/bzImage /boot/bzImage-2.6.14
cp System.map /boot/System.map-2.6.14
ln -s /boot/System.map-2.6.14 /boot/System.map
vi /boot/grub/menu.lst   # or /boot/grub/grub.conf
Done. Reboot.

.....................................................................................................

Bash Environment Variables

$HOME: Contains the path of the home directory of the current end user.

$PATH: Contains a list of directories separated by colons. This variable specifies the default directories that contain shell commands executable from any directory.

$PS1: Contains the string that denotes the command prompt in a command line environment.

$PS2: Contains a secondary prompt string. This environment variable is generally used when prompting the end user for additional input.

$IFS: Contains the Internal Field Separator (IFS), which specifies a list of characters used to separate words when the shell is reading input.

$0: Contains the name of the shell script being executed.

$#: Contains the number of parameters passed to the shell script.

$$: Contains the process ID of the shell script.

Parameter Variables

$1, $2...: Contain the parameters passed to your shell script. For example, $1 and $2 contain the first and second parameters passed to your shell script.

$*: Contains a list of all the parameters separated by the first character in the IFS environment variable.

$@: Contains a list of all parameters passed to the shell script. Does not use the IFS environment variable.

Metacharacters in Bash

? * [...] [!...]: Used as wildcards to match characters within the name of a file. For example, you use the following command to display all the files and directories whose names start with the letter A:

$ ls -ld A*

> < >> << m> m>&n: Is used to redirect the standard input, standard output, and the standard error to user-specified file descriptors.

; () & && ||: Is used with reference to process execution. You execute any executable file as a background process using the following command:

$ file_name &

\ "..." '...' `...`: Used for quoting, which removes the special meaning of certain characters and strings, and (in the case of backquotes) for command substitution.

$1..$9: Contain the parameters passed to a shell script from the command line.

$0: Contains the name of the file from which the bash script is executed.

$*: Is a string containing all the parameters separated by the first character in the IFS environment variable.

$@: Contains all the parameters passed to the bash script being executed but does not use the IFS environment variable.

$#: Contains the number of parameters passed to the bash shell script.

$!: Contains the process ID of the last background process.
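
The difference between $* and $@ is easiest to see with a parameter that contains a space. A minimal sketch follows; the function names join_with and count_args are made up for illustration.

```shell
# Join all parameters after the first, using the first one as the separator.
# The body runs in a subshell (parentheses) so the IFS change does not leak.
join_with() (
  IFS=$1
  shift
  printf '%s\n' "$*"
)

# Report how many parameters were received.
count_args() {
  printf '%s\n' "$#"
}

set -- "first param" second
join_with , "$@"     # "$*" joins with the first character of IFS
count_args "$@"      # "$@" keeps each parameter intact
count_args $*        # unquoted $* is split again on whitespace
```

The two count_args calls report different numbers because only the quoted "$@" preserves the space inside the first parameter.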

.....................................................................................................

String Comparison Using the test Command

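#!/bin/bash
# Compare the two strings passed as parameters.
string1=$1
string2=$2
# Quote the variables so that empty or multi-word values do not break test.
if test -z "$string1" || test -z "$string2"
then
echo "Invalid parameters"
exit 1    # a non-zero exit status signals an error
elif test "$string1" = "$string2"
then
echo "The strings are equal"
else
echo "The strings are not equal"
fi
exit 0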

.....................................................................................................

File Conditionals

-d file: True, if file is a directory.

-e file: True, if file exists.

-f file: True, if file is a regular file.

-g file: True, if set-group-id is set on file.

-r file: True, if file is readable.

-s file: True, if file has a non-zero size.

-u file: True, if set-user-id is set on file.

-w file: True, if file is writeable.

-x file: True, if file is executable.
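
A few of these conditionals combined in one function, as a sketch. The function name describe_path and the wording of its output are made up for illustration.

```shell
# Classify a path using the -e, -d, -f and -s file conditionals.
describe_path() {
  if [ ! -e "$1" ]; then
    echo "missing"
  elif [ -d "$1" ]; then
    echo "directory"
  elif [ -f "$1" ] && [ ! -s "$1" ]; then
    echo "empty file"
  elif [ -f "$1" ]; then
    echo "regular file"
  else
    echo "other"
  fi
}

describe_path /
describe_path /no/such/path
```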

.....................................................................................................

Shell Commands in Bash

break: Quits a loop construct before the controlling condition has been met.

:: Performs no operation.

continue: Makes the loop construct continue to the next iteration, with the loop variable taking the value of the next value in the list.

.: Reads and executes the commands from a file in the current shell instead of creating a new shell for its execution.

echo: Prints the parameter string on the standard output.

eval: Joins its arguments into a single command and executes it.

exit n: Makes the Bash script exit with an exit code of n.

export: Makes a variable available to the processes started from the shell script, such as child shells and commands.

expr: Evaluates its arguments as an arithmetic expression.

printf: Prints a formatted string to the standard output.

return: Returns control to the point from where a function was called.

set: Sets the parameter variables for the shell.

shift: Deletes the parameter at the first position and shifts all the other parameters to preceding positions. For example, the second parameter will become the first and the third parameter will become the second.

trap: Specifies actions to perform on the receipt of signals. A process in Linux can receive signals denoted by specific numbers that specify an occurrence in the environment in which the process executes.

unset: Removes variables or functions from the environment.

wc: Counts the number of lines, words, and characters in the specified file.

sort: Sorts the contents of the specified file.

cut: Cuts or picks up a given number of character or fields from the specified file.

grep: Searches the specified input for a match with the supplied pattern and displays it.

dd: Converts and copies a file. In addition, dd allows you to change the format of the data in the specified file.
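
As an illustration of shift, continue and break working together, here is a small sketch. The function name sum_until_stop and the "stop" keyword are made up for the example.

```shell
# Add up the numeric parameters, skipping empty ones, until "stop" is seen.
sum_until_stop() {
  total=0
  while [ "$#" -gt 0 ]; do
    arg=$1
    shift                       # drop the first parameter; $2 becomes $1
    if [ -z "$arg" ]; then
      continue                  # skip empty parameters
    fi
    if [ "$arg" = "stop" ]; then
      break                     # leave the loop before the list is used up
    fi
    total=$((total + arg))
  done
  echo "$total"
}

sum_until_stop 1 2 "" 3 stop 10
```

The call prints the sum of 1, 2 and 3; the 10 after "stop" is never reached.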


.....................................................................................................

Special Variable

$? Variable
contains the exit status of the last command

$$ Variable
contains the process ID of the current shell

$! Variable
contains the process ID of the last background command

$- Variable
contains the set of flags that were specified when the shell was invoked, or set later with the set command
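
A short sketch that touches each of these variables; the variable names status and bg_pid are made up for illustration.

```shell
# $? : exit status of the last command (false always fails with status 1).
false || status=$?
echo "exit status of false: $status"

# $! : process ID of the last background command; $$ : PID of this shell.
true &
bg_pid=$!
wait "$bg_pid"
echo "shell PID: $$, background PID: $bg_pid"

# $- : current shell flags; an "i" among them means an interactive shell.
case $- in
  *i*) echo "interactive shell" ;;
  *)   echo "non-interactive shell" ;;
esac
```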

.....................................................................................................

Working with SELinux

Security Enhanced Linux (SELinux) is an implementation of the Mandatory Access Control (MAC) security system
user authentication, firewalls, and Discretionary Access Controls (DAC)

To implement access control, use the policy database of SELinux

SELinux implements MAC using Linux Security Module (LSM)
LSM provides a framework that allows you to include security modules

Type Enforcement (TE)
Role Based Access Control (RBAC)

Flask Architecture of SELinux Security Models
Flux Advanced Security Kernel (Flask) architecture
The Flask architecture manages the security policy in a module called SELinux security server
Subject: Specifies the processes of Linux that perform tasks, such as reading or creating a file.
Object: Specifies the Linux resources on which the processes operate, such as files, directories, file system, and ports.
Action: Specifies the operation that a subject performs on an object, such as append, delete, rename, lock, or execute.

The policy database of the SELinux security server allows or denies an action on an object using three security attributes
User Identity: Specifies the SELinux user account for a subject and an object. The user account for the subject is specified according to the running process and the user account of the object is specified according to the owner of the object.
Role: Specifies the set of permissions granted to a SELinux user. SELinux can assign several roles to the same SELinux user. However, a user cannot act in more than one role at the same time. The role attribute has the _r name suffix.
Type: Specifies the security attribute applied to objects to identify the users who can access the objects. The type attribute has the _t name suffix.
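
These three attributes appear together in a security context written as user:role:type, for example system_u:object_r:etc_t. A sketch for picking one attribute out of such a string; the function name context_field is made up for illustration.

```shell
# Print the user, role or type part of a user:role:type security context.
context_field() {
  case $1 in
    user) n=1 ;;
    role) n=2 ;;
    type) n=3 ;;
    *) echo "unknown field: $1" >&2; return 1 ;;
  esac
  printf '%s\n' "$2" | cut -d: -f"$n"
}

context_field user system_u:object_r:etc_t
context_field role system_u:object_r:etc_t
context_field type system_u:object_r:etc_t
```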

The SELinux security server uses four types of security models to secure processes and files
TE model: Provides the type security attribute to processes and objects.
RBAC model: Helps assign roles to SELinux users and a set of permissions to each role.
User Identity model: Helps authorize SELinux users to access the resources on a Linux-based computer.
Network Security model: Specifies the access control policy to secure Linux-based computers on a network.
SELinux security server also uses the Multi-Level Security (MLS) model optionally
model helps categorize the resources of a Linux-based computer into different sensitivity levels
top secret, secret, confidential, and unrestricted
sensitivity levels represent different levels of security of resources, and help define a security policy for the resources

The TE Model
TE model helps provide access control to processes and objects
SELinux TE model provides two security decisions to the SELinux security server
Access decisions: Helps determine whether a subject is allowed to perform an action on an object or not.
Transition decisions: Helps determine type attribute created and assigned by a domain to an object. This decision is also called the labeling decision
access decision of the SELinux security server returns a TE access vector
Allow: Allows a subject to perform an action on an object.
Auditallow: Logs a permitted action when it is performed. It does not itself allow the action.
Auditdeny: Logs a denied action. Auditing of a specific denied action can be suppressed so that SELinux does not create logs for it.

The RBAC Model
two types of RBAC model
conventional RBAC model
SELinux RBAC model
conventional RBAC model authorizes end users for specific roles and provides a set of permissions to each role
SELinux RBAC model authorizes each SELinux role for a set of TE domains and each SELinux user for a set of SELinux roles

/etc/selinux/strict/src/policy/users
user smith roles {user_r admin_r};
allow user_r admin_r;
role admin_r types security_t;


The User Identity Model
user identity model helps identify and authorize an end user to access the resources
SELinux user identity is independent of the Linux UID

The Network Security Model


Policycoreutils package: Provides the policy core utilities that help perform various SELinux operations
load_policy: Helps load SELinux policies on the SELinux security server.
setfiles: Helps label filesystems of SELinux.
newrole: Helps users to change from one authorized role to another.
run_init: Helps run /etc/init.d scripts in the proper SELinux context.

Policy package: Provides configuration information for SELinux policy
libsepol: Provides an interface to manipulate the binary format of policies.
slat: Represents a policy analysis tool that helps analyze the SELinux security policies.
polgen: Helps generate SELinux policies.

Creating the /selinux Directory
/etc/fstab
none /selinux selinuxfs defaults 0 0

Navigate to the /etc/selinux directory.
Run the following make command at the shell prompt to label the filesystem:
make relabel
Run the following make command at the shell prompt to compile the SELinux policies:
make policy
After labeling the filesystem, you need to reboot the Linux-based computer. Relabel the filesystem again when the computer starts to label the files created while rebooting.
You can run the following command at the shell prompt to view the status of SELinux and verify its successful installation:
# sestatus

/boot/grub/menu.lst
kernel /vmlinuz-2.6.9-1.667 ro root=LABEL=/ enforcing=1

/etc/sysconfig/selinux

cat /selinux/enforce

To change the SELinux operation mode from enforcing to permissive, write 0 to /selinux/enforce (write 1 to switch back to enforcing):
echo 0 > /selinux/enforce

setenforce 0   # permissive

/etc/selinux/strict/src/policy
make load

checkpolicy [-b] [-c policyvers] [-d] [-o output_file] [input_file]

newrole -r sysadm_r

/usr/bin/chcon
/sbin/fixfiles
/sbin/restorecon
/usr/sbin/setfiles

Assign the role, sysadm_r, to the file or filesystem.
Navigate to the /etc/selinux/strict/src/policy directory.
Execute the following command to relabel a file or a filesystem:
# make relabel

Or

chcon system_u:object_r:etc_t /etc/hosts /etc/hosts.allow

...............................................................................................

/etc/passwd
root:x:0:0:root,9810541423,9810541423,913091830219:/root:/bin/csh
[username]:[passwd]:[UID]:[GID]:[full_name]:[directory]:[shell]

[username] is the logon name of the user.

[passwd] is the encoded password of the user.

[UID] is the user ID.

[GID] is the group ID of the user.

[full_name] is the full name of the user.

[directory] is the home directory of the user.

[shell] is the logon shell of the user.
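
Because the fields are separated by colons, a single one can be pulled out with awk. A minimal sketch; the function name passwd_field is made up for illustration.

```shell
# Read passwd-format lines on stdin and print field number $2 for user $1.
passwd_field() {
  awk -F: -v user="$1" -v n="$2" '$1 == user { print $n }'
}

# Field 6 is the home directory, field 7 the logon shell.
sample='root:x:0:0:root,9810541423,9810541423,913091830219:/root:/bin/csh'
printf '%s\n' "$sample" | passwd_field root 6
printf '%s\n' "$sample" | passwd_field root 7
```

On a live system the input would be the file itself: passwd_field root 7 < /etc/passwd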

/etc/shadow
root:$1$YgmpAbXE$9h3ghaSqjZYOrMt8ZNwBN1:11767:0:99999:7:::
[username]:[passwd]:[last]:[may]:[must]:[warn]:[expire]:[disable]:[reserved]

[username] is the user name.

[passwd] is the encoded password.

[last] is the date on which the password was last changed, expressed as the number of days since January 1, 1970.

[may] is the number of days before the password may be changed.

[must] is the number of days after which the password must be changed.

[warn] is the number of days before the password is due to expire when the user is warned.

[expire] is the number of days after the password expires when the account is disabled.

[disable] is the date on which the account is disabled, expressed as the number of days since January 1, 1970.

[reserved] is a reserved field.
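
The day counts are easier to read as calendar dates. The sketch below converts the [last] field of a shadow line; the function name shadow_last_change is made up, and the -d @seconds form is assumed to be available (it is a GNU date feature).

```shell
# Print the date of the last password change for one shadow-format line.
# Assumes GNU date, which accepts "-d @SECONDS" for seconds since 1970.
shadow_last_change() {
  days=$(printf '%s\n' "$1" | cut -d: -f3)
  date -u -d "@$((days * 86400))" +%F
}

# Single quotes keep the shell from expanding the $ signs in the hash.
shadow_last_change 'root:$1$YgmpAbXE$9h3ghaSqjZYOrMt8ZNwBN1:11767:0:99999:7:::'
```

For the example line above, 11767 days after January 1, 1970 works out to 2002-03-21.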

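Passwords are hashed using the crypt function, traditionally with an algorithm based on the Data Encryption Standard (DES).
A hash that begins with $1$, as in the example above, uses the MD5-based variant instead.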

chage
change age
for password expirations

chage [-m] [-M] [-d] [-I] [-E] [-W] user

gpasswd
set a group password

pwck
pwck command verifies the integrity of password files
/etc/passwd
/etc/shadow

grpck
grpck command verifies the integrity of group files
/etc/group
/etc/gshadow

pwconv and pwunconv
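Convert between plain and shadowed passwords: pwconv creates /etc/shadow from /etc/passwd, and pwunconv merges the passwords back into /etc/passwd.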

Pluggable Authentication Modules (PAM)
/etc/pam.conf (old)
/etc/pam.d

/usr/lib/security //modules

The /etc/pam.conf File
[service-name] [module-type] [control-flag] [module-path] [arguments]

module-type control-flag module-path arguments

/etc/pam.d/login


.................................................................................

check services running?
netstat -a -p
or
netstat -nlut

.................................................................................

chkconfig --level 0123456 ipchains off
service ipchains stop
or /etc/init.d/ipchains stop

chkconfig --level 235 iptables on
.................................................................................
Phases in Packet Routing

PRE_ROUTING

LOCAL_IN

FORWARD

LOCAL_OUT

POST_ROUTING

.............
Pre-defined Chains
The pre-defined chains in iptables are:

INPUT

FORWARD

OUTPUT

PREROUTING

POSTROUTING
/////////////////////////////////////////////////////////////////////
.............
#iptables
#eth0 #LAN 10.1.1.1, 192.168.1.1
#eth1 #WAN 203.200.89.1 to 203.200.89.4

#enable ip forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward

#set policy
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT DROP

#flush rules
iptables -F
iptables -t nat -F
iptables -X
iptables -t nat -X

#zero count
iptables -Z
iptables -t nat -Z

#Drop Suspicious Packets
for interface in /proc/sys/net/ipv4/conf/*/rp_filter
do
echo 1 > $interface
done

#Disable REDIRECTS
for interface in /proc/sys/net/ipv4/conf/*/send_redirects
do
echo 0 > $interface
done

#Load Protocol-Specific Modules
/sbin/modprobe ip_conntrack_ftp
/sbin/modprobe ip_nat_ftp

#Rule 1
iptables -A FORWARD -s 10.1.1.0/24 -j ACCEPT

#Rule 2
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -d 10.1.1.0/24 -j ACCEPT

#Rule 3
iptables -A FORWARD -m limit -d 10.1.1.0/24 -j LOG --log-prefix "Invalid incoming connection: "

#Translating User LAN Addresses for the Internet
iptables -t nat -A POSTROUTING -s 10.1.1.0/24 -o eth1 -j SNAT --to 203.200.89.1

#Configuring for DMZ Access
iptables -A FORWARD -s 192.168.1.0/24 -j ACCEPT
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -d 192.168.1.0/24 -j ACCEPT
iptables -A FORWARD -m limit -d 192.168.1.0/24 -j LOG --log-prefix "Invalid connection to DMZ: "

#SNAT for Outgoing Connections from Hosts in the DMZ
iptables -t nat -A POSTROUTING -s 192.168.1.2 -o eth1 -j SNAT --to 203.200.89.2

#Forwarding Incoming Connections to Appropriate Servers
iptables -t nat -A PREROUTING -d 203.200.89.2 -p tcp --dport 53 -j DNAT --to 192.168.1.2
iptables -t nat -A PREROUTING -d 203.200.89.2 -p udp --dport 53 -j DNAT --to 192.168.1.2
iptables -t nat -A PREROUTING -d 203.200.89.3 -p tcp --dport 25 -j DNAT --to 192.168.1.3
iptables -t nat -A PREROUTING -d 203.200.89.3 -p tcp --dport 110 -j DNAT --to 192.168.1.3
iptables -t nat -A PREROUTING -d 203.200.89.3 -p tcp --dport 143 -j DNAT --to 192.168.1.3
iptables -t nat -A PREROUTING -d 203.200.89.4 -p tcp --dport 80 -j DNAT --to 192.168.1.4
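
Since the forwarding rules differ only in address, protocol and port, they could be generated rather than typed. This sketch only prints the rules for review; the function name dnat_rule is made up for illustration.

```shell
# Print (not run) the DNAT rule that forwards one public address and port
# to an internal host. Arguments: public IP, protocol, port, private IP.
dnat_rule() {
  public_ip=$1 proto=$2 port=$3 private_ip=$4
  printf 'iptables -t nat -A PREROUTING -d %s -p %s --dport %s -j DNAT --to %s\n' \
    "$public_ip" "$proto" "$port" "$private_ip"
}

# Mail server from the rules above: ports 25, 110 and 143 go to 192.168.1.3.
for port in 25 110 143; do
  dnat_rule 203.200.89.3 tcp "$port" 192.168.1.3
done
```

Pipe the output to sh as root once the printed rules look right.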

#Configuring Firewalls for the Firewall Host
#Rule 1:
iptables -A OUTPUT -j ACCEPT
#Rule 2:
iptables -A INPUT -m state --state ESTABLISHED,RELATED -i eth1 -j ACCEPT
#Rule 3:
iptables -A INPUT -m limit -i eth1 -j LOG --log-prefix "Invalid connect to f-wall: "
#Rule 4:
iptables -A INPUT -i eth0 -j ACCEPT
/////////////////////////////////////////////////////////////////////
rc.firewall
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT DROP
iptables -F
iptables -t nat -F
iptables -X
iptables -t nat -X
iptables -Z
iptables -t nat -Z
for interface in /proc/sys/net/ipv4/conf/*/rp_filter
do
echo 1 > $interface
done
for interface in /proc/sys/net/ipv4/conf/*/send_redirects
do
echo 0 > $interface
done
/sbin/modprobe ip_conntrack_ftp
/sbin/modprobe ip_nat_ftp
iptables -A FORWARD -s 10.1.1.0/24 -j ACCEPT
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -d 10.1.1.0/24 -j ACCEPT
iptables -A FORWARD -m limit -d 10.1.1.0/24 -j LOG --log-prefix "Invalid incoming connection: "
iptables -t nat -A POSTROUTING -s 10.1.1.0/24 -o eth1 -j SNAT --to 203.200.89.1
iptables -A FORWARD -s 192.168.1.0/24 -j ACCEPT
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -d 192.168.1.0/24 -j ACCEPT
iptables -A FORWARD -m limit -d 192.168.1.0/24 -j LOG --log-prefix "Invalid connection to DMZ: "
iptables -t nat -A PREROUTING -d 203.200.89.2 -p tcp --dport 53 -j DNAT --to 192.168.1.2
iptables -t nat -A PREROUTING -d 203.200.89.2 -p udp --dport 53 -j DNAT --to 192.168.1.2
iptables -t nat -A PREROUTING -d 203.200.89.3 -p tcp --dport 25 -j DNAT --to 192.168.1.3
iptables -t nat -A PREROUTING -d 203.200.89.3 -p tcp --dport 110 -j DNAT --to 192.168.1.3
iptables -t nat -A PREROUTING -d 203.200.89.3 -p tcp --dport 143 -j DNAT --to 192.168.1.3
iptables -t nat -A PREROUTING -d 203.200.89.4 -p tcp --dport 80 -j DNAT --to 192.168.1.4
iptables -A OUTPUT -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -i eth1 -j ACCEPT
iptables -A INPUT -m limit -i eth1 -j LOG --log-prefix "Invalid connect to f-wall: "
iptables -A INPUT -i eth0 -j ACCEPT
/////////////////////////////////////////////////////////////////////

more sysctl options

# Drop ICMP echo-request messages sent to broadcast or multicast addresses
echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts

# Drop source routed packets
echo 0 > /proc/sys/net/ipv4/conf/all/accept_source_route

# Enable TCP SYN cookie protection from SYN floods
echo 1 > /proc/sys/net/ipv4/tcp_syncookies

# Don't accept ICMP redirect messages
echo 0 > /proc/sys/net/ipv4/conf/all/accept_redirects

# Don't send ICMP redirect messages
echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects

# Enable source address spoofing protection
echo 1 > /proc/sys/net/ipv4/conf/all/rp_filter

# Log packets with impossible source addresses
echo 1 > /proc/sys/net/ipv4/conf/all/log_martians
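
Values written with echo are lost at reboot. To keep them, the same settings can go into /etc/sysctl.conf, where the key is the path below /proc/sys with the slashes turned into dots. A small sketch of that conversion; the function name proc_to_sysctl is made up for illustration.

```shell
# Turn a /proc/sys path into the matching sysctl key name.
proc_to_sysctl() {
  printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

proc_to_sysctl /proc/sys/net/ipv4/tcp_syncookies
proc_to_sysctl /proc/sys/net/ipv4/conf/all/rp_filter
```

The resulting key is used as "net.ipv4.tcp_syncookies = 1" in /etc/sysctl.conf, or on the command line as sysctl -w net.ipv4.tcp_syncookies=1.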


.................................................................................................

#!/bin/sh

# Kernel monitoring support
# More information:
# /usr/src/linux-`uname -r`/Documentation/networking/ip-sysctl.txt
# http://www.linuxgazette.com/book/view/1645
# http://www.spirit.com/Network/net0300.html

# Drop ICMP echo-request messages sent to broadcast or multicast addresses
echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts

# Drop source routed packets
echo 0 > /proc/sys/net/ipv4/conf/all/accept_source_route

# Enable TCP SYN cookie protection from SYN floods
echo 1 > /proc/sys/net/ipv4/tcp_syncookies

# Don't accept ICMP redirect messages
echo 0 > /proc/sys/net/ipv4/conf/all/accept_redirects

# Don't send ICMP redirect messages
echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects

# Enable source address spoofing protection
echo 1 > /proc/sys/net/ipv4/conf/all/rp_filter

# Log packets with impossible source addresses
echo 1 > /proc/sys/net/ipv4/conf/all/log_martians

# Flush all chains
/sbin/iptables --flush

# Allow unlimited traffic on the loopback interface
/sbin/iptables -A INPUT -i lo -j ACCEPT
/sbin/iptables -A OUTPUT -o lo -j ACCEPT

# Set default policies
/sbin/iptables --policy INPUT DROP
/sbin/iptables --policy OUTPUT DROP
/sbin/iptables --policy FORWARD DROP

# Previously initiated and accepted exchanges bypass rule checking
# Allow unlimited outbound traffic
/sbin/iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
/sbin/iptables -A OUTPUT -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT

# Allow incoming TCP port 22 (ssh) traffic from office
/sbin/iptables -A INPUT -p tcp -s 192.168.1.100 --dport 22 -m state --state NEW -j ACCEPT


# Drop all other traffic
/sbin/iptables -A INPUT -j DROP

# Have these rules take effect when iptables is started
/sbin/service iptables save


That is the end of the original script.


If you want to make a syslog entry of dropped packets, change:

# Drop all other traffic
/sbin/iptables -A INPUT -j DROP

To:


# Create a LOGDROP chain to log and drop packets
/sbin/iptables -N LOGDROP
/sbin/iptables -A LOGDROP -j LOG
/sbin/iptables -A LOGDROP -j DROP

# Drop all other traffic
/sbin/iptables -A INPUT -j LOGDROP



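You may also want to set the --log-level so that dropped packets can be logged to a separate file instead of /var/log/messages. The option belongs to the LOG target, not to the jump into LOGDROP, so replace the LOG rule in the LOGDROP chain with:


# Log dropped packets at the debug level
/sbin/iptables -A LOGDROP -j LOG --log-level debug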



/etc/syslog.conf change:

# Send iptables LOGDROPs to /var/log/iptables
kern.=debug /var/log/iptables

Reload the syslogd service for the change to take effect.
/sbin/service syslog reload



If you do not want to allow incoming ssh, remove:


# Allow port 22 (ssh) TCP traffic from office
/sbin/iptables -A INPUT -p tcp -s 192.168.1.100/32 --dport 22 -m state --state NEW -j ACCEPT

/////////////////////////////////////////////////////////////////////////////////////////////////

By default the iptables LOG target logs at kern.warning, so in /etc/syslog.conf you can simply put:
kern.warning /var/log/iptables.log

#define KERN_EMERG "<0>" /* system is unusable */
#define KERN_ALERT "<1>" /* action must be taken immediately */
#define KERN_CRIT "<2>" /* critical conditions */
#define KERN_ERR "<3>" /* error conditions */
#define KERN_WARNING "<4>" /* warning conditions */
#define KERN_NOTICE "<5>" /* normal but significant condition */
#define KERN_INFO "<6>" /* informational */
#define KERN_DEBUG "<7>" /* debug-level messages */

The iptables LOG target has a --log-level parameter that lets you select the
level used for syslog output

service syslog --full-restart

/////////////////////////////////////////////////////////////////////////////////////////////////

cat /usr/src/linux-2.6.14/Documentation/networking/ip-sysctl.txt

/proc/sys/net/ipv4/* Variables:

ip_forward - BOOLEAN
0 - disabled (default)
not 0 - enabled

Forward Packets between interfaces.

This variable is special, its change resets all configuration
parameters to their default state (RFC1122 for hosts, RFC1812
for routers)

ip_default_ttl - INTEGER
default 64

ip_no_pmtu_disc - BOOLEAN
Disable Path MTU Discovery.
default FALSE

min_pmtu - INTEGER
default 562 - minimum discovered Path MTU

mtu_expires - INTEGER
Time, in seconds, that cached PMTU information is kept.

min_adv_mss - INTEGER
The advertised MSS depends on the first hop route MTU, but will
never be lower than this setting.

IP Fragmentation:

ipfrag_high_thresh - INTEGER
Maximum memory used to reassemble IP fragments. When
ipfrag_high_thresh bytes of memory is allocated for this purpose,
the fragment handler will toss packets until ipfrag_low_thresh
is reached.

ipfrag_low_thresh - INTEGER
See ipfrag_high_thresh

ipfrag_time - INTEGER
Time in seconds to keep an IP fragment in memory.

ipfrag_secret_interval - INTEGER
Regeneration interval (in seconds) of the hash secret (or lifetime
for the hash secret) for IP fragments.
Default: 600

INET peer storage:

inet_peer_threshold - INTEGER
The approximate size of the storage. Starting from this threshold
entries will be thrown aggressively. This threshold also determines
entries' time-to-live and time intervals between garbage collection
passes. More entries, less time-to-live, less GC interval.

inet_peer_minttl - INTEGER
Minimum time-to-live of entries. Should be enough to cover fragment
time-to-live on the reassembling side. This minimum time-to-live is
guaranteed if the pool size is less than inet_peer_threshold.
Measured in jiffies(1).

inet_peer_maxttl - INTEGER
Maximum time-to-live of entries. Unused entries will expire after
this period of time if there is no memory pressure on the pool (i.e.
when the number of entries in the pool is very small).
Measured in jiffies(1).

inet_peer_gc_mintime - INTEGER
Minimum interval between garbage collection passes. This interval is
in effect under high memory pressure on the pool.
Measured in jiffies(1).

inet_peer_gc_maxtime - INTEGER
Maximum interval between garbage collection passes. This interval is
in effect under low (or absent) memory pressure on the pool.
Measured in jiffies(1).

TCP variables:

tcp_syn_retries - INTEGER
Number of times initial SYNs for an active TCP connection attempt
will be retransmitted. Should not be higher than 255. Default value
is 5, which corresponds to ~180 seconds.

tcp_synack_retries - INTEGER
Number of times SYNACKs for a passive TCP connection attempt will
be retransmitted. Should not be higher than 255. Default value
is 5, which corresponds to ~180 seconds.

tcp_keepalive_time - INTEGER
How often TCP sends out keepalive messages when keepalive is enabled.
Default: 2 hours.

tcp_keepalive_probes - INTEGER
How many keepalive probes TCP sends out, until it decides that the
connection is broken. Default value: 9.

tcp_keepalive_intvl - INTEGER
How frequently the probes are sent out. Multiplied by
tcp_keepalive_probes, it gives the time after which an unresponsive
connection is killed once probing has started. Default value: 75 sec, i.e. the
connection will be aborted after ~11 minutes of retries.
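
The arithmetic behind the "~11 minutes" figure can be checked with a short Python sketch. It uses the documented defaults of the three keepalive variables; keepalive_timeline is a hypothetical helper name.

```python
def keepalive_timeline(time_s=7200, probes=9, intvl_s=75):
    """Return (seconds spent probing, total idle seconds before the abort).

    time_s   ~ tcp_keepalive_time   (idle time before the first probe)
    probes   ~ tcp_keepalive_probes
    intvl_s  ~ tcp_keepalive_intvl
    """
    probing = probes * intvl_s
    return probing, time_s + probing

probing, total = keepalive_timeline()
print(probing / 60, "minutes of probing;", total / 60, "minutes in total")
```

With the defaults, probing lasts 675 seconds (11.25 minutes), on top of the 2 hours of idle time before the first probe.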

tcp_retries1 - INTEGER
How many times to retry before deciding that something is wrong
and it is necessary to report this suspicion to network layer.
Minimal RFC value is 3, it is default, which corresponds
to ~3sec-8min depending on RTO.

tcp_retries2 - INTEGER
How many times to retry before killing a live TCP connection.
RFC1122 says that the limit should be longer than 100 sec,
which is too small. Default value 15 corresponds to ~13-30min
depending on RTO.

tcp_orphan_retries - INTEGER
How many times to retry before killing a TCP connection closed
by our side. Default value 7 corresponds to ~50sec-16min
depending on RTO. If your machine is a loaded web server,
you should think about lowering this value, since such sockets
may consume significant resources. Cf. tcp_max_orphans.

tcp_fin_timeout - INTEGER
Time to hold a socket in state FIN-WAIT-2, if it was closed
by our side. The peer can be broken and never close its side,
or even die unexpectedly. Default value is 60sec.
The usual value in 2.2 was 180 seconds; you may restore
it, but remember that even if your machine is a lightly loaded web server,
you risk overflowing memory with kilotons of dead sockets.
FIN-WAIT-2 sockets are less dangerous than FIN-WAIT-1,
because they eat at most 1.5K of memory, but they tend
to live longer. Cf. tcp_max_orphans.

tcp_max_tw_buckets - INTEGER
Maximal number of timewait sockets held by the system simultaneously.
If this number is exceeded, the time-wait socket is immediately destroyed
and a warning is printed. This limit exists only to prevent
simple DoS attacks; you _must_ not lower the limit artificially,
but rather increase it (probably, after increasing installed memory),
if network conditions require more than the default value.

tcp_tw_recycle - BOOLEAN
Enable fast recycling TIME-WAIT sockets. Default value is 0.
It should not be changed without advice/request of technical
experts.

tcp_tw_reuse - BOOLEAN
Allow to reuse TIME-WAIT sockets for new connections when it is
safe from protocol viewpoint. Default value is 0.
It should not be changed without advice/request of technical
experts.

tcp_max_orphans - INTEGER
Maximal number of TCP sockets not attached to any user file handle,
held by the system. If this number is exceeded, orphaned connections are
reset immediately and a warning is printed. This limit exists
only to prevent simple DoS attacks; you _must_ not rely on this
or lower the limit artificially, but rather increase it
(probably, after increasing installed memory),
if network conditions require more than the default value,
and tune network services to linger and kill such states
more aggressively. As a reminder: each orphan eats
up to ~64K of unswappable memory.

tcp_abort_on_overflow - BOOLEAN
If listening service is too slow to accept new connections,
reset them. Default state is FALSE. It means that if overflow
occurred due to a burst, connection will recover. Enable this
option _only_ if you are really sure that listening daemon
cannot be tuned to accept connections faster. Enabling this
option can harm clients of your server.

tcp_syncookies - BOOLEAN
Only valid when the kernel was compiled with CONFIG_SYNCOOKIES
Send out syncookies when the syn backlog queue of a socket
overflows. This is to prevent against the common 'syn flood attack'
Default: FALSE

Note that syncookies is a fallback facility.
It MUST NOT be used to help highly loaded servers stand
against a legal connection rate. If you see synflood warnings
in your logs, but investigation shows that they occur
because of overload with legal connections, you should tune
other parameters until this warning disappears.
See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.

syncookies seriously violate the TCP protocol, do not allow
the use of TCP extensions, and can result in serious degradation
of some services (e.g. SMTP relaying), visible not to you,
but to your clients and relays contacting you. If you see
synflood warnings in logs while not really being flooded, your server
is seriously misconfigured.

tcp_stdurg - BOOLEAN
Use the Host requirements interpretation of the TCP urg pointer field.
Most hosts use the older BSD interpretation, so if you turn this on
Linux might not communicate correctly with them.
Default: FALSE

tcp_max_syn_backlog - INTEGER
Maximal number of remembered connection requests which
have still not received an acknowledgment from the connecting client.
Default value is 1024 for systems with more than 128Mb of memory,
and 128 for low memory machines. If the server suffers from overload,
try to increase this number.

tcp_window_scaling - BOOLEAN
Enable window scaling as defined in RFC1323.

tcp_timestamps - BOOLEAN
Enable timestamps as defined in RFC1323.

tcp_sack - BOOLEAN
Enable selective acknowledgments (SACKs).

tcp_fack - BOOLEAN
Enable FACK congestion avoidance and fast retransmission.
The value is not used, if tcp_sack is not enabled.

tcp_dsack - BOOLEAN
Allows TCP to send "duplicate" SACKs.

tcp_ecn - BOOLEAN
Enable Explicit Congestion Notification in TCP.

tcp_reordering - INTEGER
Maximal reordering of packets in a TCP stream.
Default: 3

tcp_retrans_collapse - BOOLEAN
Bug-to-bug compatibility with some broken printers.
On retransmit try to send bigger packets to work around bugs in
certain TCP stacks.

tcp_wmem - vector of 3 INTEGERs: min, default, max
min: Amount of memory reserved for send buffers for TCP socket.
Each TCP socket has rights to use it due to fact of its birth.
Default: 4K

default: Amount of memory allowed for send buffers for TCP socket
by default. This value overrides net.core.wmem_default used
by other protocols, it is usually lower than net.core.wmem_default.
Default: 16K

max: Maximal amount of memory allowed for automatically selected
send buffers for TCP socket. This value does not override
net.core.wmem_max, "static" selection via SO_SNDBUF does not use this.
Default: 128K

tcp_rmem - vector of 3 INTEGERs: min, default, max
min: Minimal size of receive buffer used by TCP sockets.
It is guaranteed to each TCP socket, even under moderate memory
pressure.
Default: 8K

default: default size of receive buffer used by TCP sockets.
This value overrides net.core.rmem_default used by other protocols.
Default: 87380 bytes. This value results in window of 65535 with
default setting of tcp_adv_win_scale and tcp_app_win:0 and a bit
less for default tcp_app_win. See below about these variables.

max: maximal size of receive buffer allowed for automatically
selected receiver buffers for TCP socket. This value does not override
net.core.rmem_max, "static" selection via SO_RCVBUF does not use this.
Default: 87380*2 bytes.

tcp_mem - vector of 3 INTEGERs: low, pressure, high
low: below this number of pages TCP is not bothered about its
memory appetite.

pressure: when amount of memory allocated by TCP exceeds this number
of pages, TCP moderates its memory consumption and enters memory
pressure mode, which is exited when memory consumption falls
under "low".

high: number of pages allowed for queueing by all TCP sockets.

Defaults are calculated at boot time from amount of available
memory.

tcp_app_win - INTEGER
Reserve max(window/2^tcp_app_win, mss) of window for application
buffer. Value 0 is special, it means that nothing is reserved.
Default: 31

tcp_adv_win_scale - INTEGER
Count buffering overhead as bytes/2^tcp_adv_win_scale
(if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale),
if it is <= 0.
Default: 2
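
A minimal Python sketch of the overhead formula above, assuming integer (shift) division as a stand-in for the division by a power of two. advertised_window is a hypothetical helper; it only reproduces the arithmetic in this text, not the full kernel logic (tcp_app_win is taken as 0).

```python
def advertised_window(buffer_bytes, adv_win_scale=2):
    """Window left after subtracting the buffering overhead from the buffer."""
    if adv_win_scale > 0:
        overhead = buffer_bytes >> adv_win_scale
    else:
        overhead = buffer_bytes - (buffer_bytes >> -adv_win_scale)
    return buffer_bytes - overhead

# Default tcp_rmem "default" value with the default tcp_adv_win_scale of 2.
print(advertised_window(87380))
```

With the default receive buffer of 87380 bytes and tcp_adv_win_scale of 2, this gives the 65535-byte window mentioned under tcp_rmem.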

tcp_rfc1337 - BOOLEAN
If set, the TCP stack behaves conforming to RFC1337. If unset,
we are not conforming to RFC, but prevent TCP TIME_WAIT
assassination.
Default: 0

tcp_low_latency - BOOLEAN
If set, the TCP stack makes decisions that prefer lower
latency as opposed to higher throughput. By default, this
option is not set meaning that higher throughput is preferred.
An example of an application where this default should be
changed would be a Beowulf compute cluster.
Default: 0

tcp_tso_win_divisor - INTEGER
This allows control over what percentage of the congestion window
can be consumed by a single TSO frame.
The setting of this parameter is a choice between burstiness and
building larger TSO frames.
Default: 8

tcp_frto - BOOLEAN
Enables F-RTO, an enhanced recovery algorithm for TCP retransmission
timeouts. It is particularly beneficial in wireless environments
where packet loss is typically due to random radio interference
rather than intermediate router congestion.

tcp_congestion_control - STRING
Set the congestion control algorithm to be used for new
connections. The algorithm "reno" is always available, but
additional choices may be available based on kernel configuration.

somaxconn - INTEGER
Limit of socket listen() backlog, known in userspace as SOMAXCONN.
Defaults to 128. See also tcp_max_syn_backlog for additional tuning
for TCP sockets.

IP Variables:

ip_local_port_range - 2 INTEGERS
Defines the local port range that is used by TCP and UDP to
choose the local port. The first number is the first, the
second the last local port number. Default value depends on
amount of memory available on the system:
> 128Mb 32768-61000
< 128Mb 1024-4999 or even less.
This number defines the number of active connections which this
system can issue simultaneously to systems not supporting
TCP extensions (timestamps). With tcp_tw_recycle enabled,
range 1024-4999 is enough to issue up to
2000 connections per second to systems supporting timestamps.

ip_nonlocal_bind - BOOLEAN
If set, allows processes to bind() to non-local IP addresses,
which can be quite useful - but may break some applications.
Default: 0

ip_dynaddr - BOOLEAN
If set non-zero, enables support for dynamic addresses.
If set to a non-zero value larger than 1, a kernel log
message will be printed when dynamic address rewriting
occurs.
Default: 0

icmp_echo_ignore_all - BOOLEAN
If set non-zero, then the kernel will ignore all ICMP ECHO
requests sent to it.
Default: 0

icmp_echo_ignore_broadcasts - BOOLEAN
If set non-zero, then the kernel will ignore all ICMP ECHO and
TIMESTAMP requests sent to it via broadcast/multicast.
Default: 1

icmp_ratelimit - INTEGER
Limit the maximal rates for sending ICMP packets whose type matches
icmp_ratemask (see below) to specific targets.
0 to disable any limiting, otherwise the maximal rate in jiffies(1)
Default: 100

icmp_ratemask - INTEGER
Mask made of ICMP types for which rates are being limited.
Significant bits: IHGFEDCBA9876543210
Default mask: 0000001100000011000 (6168)

Bit definitions (see include/linux/icmp.h):
0 Echo Reply
3 Destination Unreachable *
4 Source Quench *
5 Redirect
8 Echo Request
B Time Exceeded *
C Parameter Problem *
D Timestamp Request
E Timestamp Reply
F Info Request
G Info Reply
H Address Mask Request
I Address Mask Reply

* These are rate limited by default (see default mask above)
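
The default mask can be decoded with a few lines of Python. The bit numbers are the ICMP types from the table above (B = 11, C = 12, and so on); limited_types is a hypothetical helper name.

```python
# ICMP type number -> name, taken from the bit definitions above.
ICMP_TYPES = {
    0: "Echo Reply", 3: "Destination Unreachable", 4: "Source Quench",
    5: "Redirect", 8: "Echo Request", 11: "Time Exceeded",
    12: "Parameter Problem", 13: "Timestamp Request", 14: "Timestamp Reply",
    15: "Info Request", 16: "Info Reply", 17: "Address Mask Request",
    18: "Address Mask Reply",
}

def limited_types(mask):
    """Return the names of the ICMP types whose bit is set in icmp_ratemask."""
    return [name for bit, name in sorted(ICMP_TYPES.items()) if mask & (1 << bit)]

print(limited_types(6168))
```

To build a new mask, OR together 1 << type for each type that should be rate limited.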

icmp_ignore_bogus_error_responses - BOOLEAN
Some routers violate RFC1122 by sending bogus responses to broadcast
frames. Such violations are normally logged via a kernel warning.
If this is set to TRUE, the kernel will not give such warnings, which
will avoid log file clutter.
Default: FALSE

igmp_max_memberships - INTEGER
Change the maximum number of multicast groups we can subscribe to.
Default: 20

conf/interface/* changes special settings per interface (where "interface" is
the name of your network interface)
conf/all/* is special, changes the settings for all interfaces


log_martians - BOOLEAN
Log packets with impossible addresses to kernel log.
log_martians for the interface will be enabled if at least one of
conf/{all,interface}/log_martians is set to TRUE,
it will be disabled otherwise

accept_redirects - BOOLEAN
Accept ICMP redirect messages.
accept_redirects for the interface will be enabled if:
- both conf/{all,interface}/accept_redirects are TRUE in the case forwarding
for the interface is enabled
or
- at least one of conf/{all,interface}/accept_redirects is TRUE in the case
forwarding for the interface is disabled
accept_redirects for the interface will be disabled otherwise
default TRUE (host)
FALSE (router)
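
The rule for combining conf/all and conf/interface can be written as a small Python truth function. This is only a restatement of the text above; accept_redirects_enabled is a hypothetical name.

```python
def accept_redirects_enabled(conf_all, conf_iface, forwarding):
    """Effective accept_redirects for one interface.

    forwarding enabled  -> both settings must be TRUE (AND)
    forwarding disabled -> one TRUE setting is enough (OR)
    """
    if forwarding:
        return conf_all and conf_iface
    return conf_all or conf_iface

print(accept_redirects_enabled(True, False, forwarding=True))
print(accept_redirects_enabled(True, False, forwarding=False))
```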

forwarding - BOOLEAN
Enable IP forwarding on this interface.

mc_forwarding - BOOLEAN
Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE
and a multicast routing daemon is required.
conf/all/mc_forwarding must also be set to TRUE to enable multicast routing
for the interface

medium_id - INTEGER
Integer value used to differentiate the devices by the medium they
are attached to. Two devices can have different id values when
the broadcast packets are received only on one of them.
The default value 0 means that the device is the only interface
to its medium, value of -1 means that medium is not known.

Currently, it is used to change the proxy_arp behavior:
the proxy_arp feature is enabled for packets forwarded between
two devices attached to different media.

proxy_arp - BOOLEAN
Do proxy arp.
proxy_arp for the interface will be enabled if at least one of
conf/{all,interface}/proxy_arp is set to TRUE,
it will be disabled otherwise

shared_media - BOOLEAN
Send(router) or accept(host) RFC1620 shared media redirects.
Overrides ip_secure_redirects.
shared_media for the interface will be enabled if at least one of
conf/{all,interface}/shared_media is set to TRUE,
it will be disabled otherwise
default TRUE

secure_redirects - BOOLEAN
Accept ICMP redirect messages only for gateways,
listed in default gateway list.
secure_redirects for the interface will be enabled if at least one of
conf/{all,interface}/secure_redirects is set to TRUE,
it will be disabled otherwise
default TRUE

send_redirects - BOOLEAN
Send redirects, if router.
send_redirects for the interface will be enabled if at least one of
conf/{all,interface}/send_redirects is set to TRUE,
it will be disabled otherwise
Default: TRUE

bootp_relay - BOOLEAN
Accept packets with source address 0.b.c.d destined
not to this host as local ones. It is supposed, that
BOOTP relay daemon will catch and forward such packets.
conf/all/bootp_relay must also be set to TRUE to enable BOOTP relay
for the interface
default FALSE
Not Implemented Yet.

accept_source_route - BOOLEAN
Accept packets with SRR option.
conf/all/accept_source_route must also be set to TRUE to accept packets
with SRR option on the interface
default TRUE (router)
FALSE (host)

rp_filter - BOOLEAN
1 - do source validation by reversed path, as specified in RFC1812
Recommended option for single homed hosts and stub network
routers. Could cause troubles for complicated (not loop free)
networks running a slow unreliable protocol (sort of RIP),
or using static routes.

0 - No source validation.

conf/all/rp_filter must also be set to TRUE to do source validation
on the interface

Default value is 0. Note that some distributions enable it
in startup scripts.

arp_filter - BOOLEAN
1 - Allows you to have multiple network interfaces on the same
subnet, and have the ARPs for each interface be answered
based on whether or not the kernel would route a packet from
the ARP'd IP out that interface (therefore you must use source
based routing for this to work). In other words it allows control
of which cards (usually 1) will respond to an arp request.

0 - (default) The kernel can respond to arp requests with addresses
from other interfaces. This may seem wrong but it usually makes
sense, because it increases the chance of successful communication.
IP addresses are owned by the complete host on Linux, not by
particular interfaces. Only for more complex setups like load-
balancing, does this behaviour cause problems.

arp_filter for the interface will be enabled if at least one of
conf/{all,interface}/arp_filter is set to TRUE,
it will be disabled otherwise

arp_announce - INTEGER
Define different restriction levels for announcing the local
source IP address from IP packets in ARP requests sent on
interface:
0 - (default) Use any local address, configured on any interface
1 - Try to avoid local addresses that are not in the target's
subnet for this interface. This mode is useful when target
hosts reachable via this interface require the source IP
address in ARP requests to be part of their logical network
configured on the receiving interface. When we generate the
request we will check all our subnets that include the
target IP and will preserve the source address if it is from
such subnet. If there is no such subnet we select source
address according to the rules for level 2.
2 - Always use the best local address for this target.
In this mode we ignore the source address in the IP packet
and try to select local address that we prefer for talks with
the target host. Such local address is selected by looking
for primary IP addresses on all our subnets on the outgoing
interface that include the target IP address. If no suitable
local address is found we select the first local address
we have on the outgoing interface or on all other interfaces,
with the hope we will receive reply for our request and
even sometimes no matter the source IP address we announce.

The max value from conf/{all,interface}/arp_announce is used.

Increasing the restriction level gives more chance for
receiving answer from the resolved target while decreasing
the level announces more valid sender's information.

arp_ignore - INTEGER
Define different modes for sending replies in response to
received ARP requests that resolve local target IP addresses:
0 - (default): reply for any local target IP address, configured
on any interface
1 - reply only if the target IP address is local address
configured on the incoming interface
2 - reply only if the target IP address is local address
configured on the incoming interface and both with the
sender's IP address are part from same subnet on this interface
3 - do not reply for local addresses configured with scope host,
only resolutions for global and link addresses are replied
4-7 - reserved
8 - do not reply for all local addresses

The max value from conf/{all,interface}/arp_ignore is used
when ARP request is received on the {interface}

app_solicit - INTEGER
The maximum number of probes to send to the user space ARP daemon
via netlink before dropping back to multicast probes (see
mcast_solicit). Defaults to 0.

disable_policy - BOOLEAN
Disable IPSEC policy (SPD) for this interface

disable_xfrm - BOOLEAN
Disable IPSEC encryption on this interface, whatever the policy



tag - INTEGER
Allows you to write a number, which can be used as required.
Default value is 0.

(1) Jiffie: internal timeunit for the kernel. On the i386 1/100s, on the
Alpha 1/1024s. See the HZ define in /usr/include/asm/param.h for the exact
value on your system.
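
A one-line Python conversion for the values above that are measured in jiffies. The default hz=100 follows the i386 figure in the footnote; kernels built with a different HZ need that value passed in. jiffies_to_seconds is a hypothetical helper.

```python
def jiffies_to_seconds(jiffies, hz=100):
    """Convert a jiffies count to seconds for a kernel ticking hz times per second."""
    return jiffies / hz

# icmp_ratelimit default of 100 jiffies on a HZ=100 kernel
print(jiffies_to_seconds(100))
```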

Alexey Kuznetsov.
kuznet@ms2.inr.ac.ru

Updated by:
Andi Kleen
ak@muc.de
Nicolas Delon
delon.nicolas@wanadoo.fr




/proc/sys/net/ipv6/* Variables:

IPv6 has no global variables such as tcp_*. tcp_* settings under ipv4/ also
apply to IPv6 [XXX?].

bindv6only - BOOLEAN
Default value for IPV6_V6ONLY socket option,
which restricts use of the IPv6 socket to IPv6 communication
only.
TRUE: disable IPv4-mapped address feature
FALSE: enable IPv4-mapped address feature

Default: FALSE (as specified in RFC2553bis)

IPv6 Fragmentation:

ip6frag_high_thresh - INTEGER
Maximum memory used to reassemble IPv6 fragments. When
ip6frag_high_thresh bytes of memory is allocated for this purpose,
the fragment handler will toss packets until ip6frag_low_thresh
is reached.

ip6frag_low_thresh - INTEGER
See ip6frag_high_thresh

ip6frag_time - INTEGER
Time in seconds to keep an IPv6 fragment in memory.

ip6frag_secret_interval - INTEGER
Regeneration interval (in seconds) of the hash secret (or lifetime
for the hash secret) for IPv6 fragments.
Default: 600

conf/default/*:
Change the interface-specific default settings.


conf/all/*:
Change all the interface-specific settings.

[XXX: Other special features than forwarding?]

conf/all/forwarding - BOOLEAN
Enable global IPv6 forwarding between all interfaces.

IPv4 and IPv6 work differently here; e.g. netfilter must be used
to control which interfaces may forward packets and which not.

This also sets all interfaces' Host/Router setting
'forwarding' to the specified value. See below for details.

This is referred to as global forwarding.

conf/interface/*:
Change special settings per interface.

The functional behaviour for certain settings is different
depending on whether local forwarding is enabled or not.

accept_ra - BOOLEAN
Accept Router Advertisements; autoconfigure using them.

Functional default: enabled if local forwarding is disabled.
disabled if local forwarding is enabled.

accept_redirects - BOOLEAN
Accept Redirects.

Functional default: enabled if local forwarding is disabled.
disabled if local forwarding is enabled.

autoconf - BOOLEAN
Autoconfigure addresses using Prefix Information in Router
Advertisements.

Functional default: enabled if accept_ra is enabled.
disabled if accept_ra is disabled.

dad_transmits - INTEGER
The number of Duplicate Address Detection probes to send.
Default: 1

forwarding - BOOLEAN
Configure interface-specific Host/Router behaviour.

Note: It is recommended to have the same setting on all
interfaces; mixed router/host scenarios are rather uncommon.

FALSE:

By default, Host behaviour is assumed. This means:

1. IsRouter flag is not set in Neighbour Advertisements.
2. Router Solicitations are being sent when necessary.
3. If accept_ra is TRUE (default), accept Router
Advertisements (and do autoconfiguration).
4. If accept_redirects is TRUE (default), accept Redirects.

TRUE:

If local forwarding is enabled, Router behaviour is assumed.
This means exactly the reverse from the above:

1. IsRouter flag is set in Neighbour Advertisements.
2. Router Solicitations are not sent.
3. Router Advertisements are ignored.
4. Redirects are ignored.

Default: FALSE if global forwarding is disabled (default),
otherwise TRUE.

hop_limit - INTEGER
Default Hop Limit to set.
Default: 64

mtu - INTEGER
Default Maximum Transfer Unit
Default: 1280 (IPv6 required minimum)

router_solicitation_delay - INTEGER
Number of seconds to wait after interface is brought up
before sending Router Solicitations.
Default: 1

router_solicitation_interval - INTEGER
Number of seconds to wait between Router Solicitations.
Default: 4

router_solicitations - INTEGER
Number of Router Solicitations to send until assuming no
routers are present.
Default: 3

use_tempaddr - INTEGER
Preference for Privacy Extensions (RFC3041).
<= 0 : disable Privacy Extensions
== 1 : enable Privacy Extensions, but prefer public
addresses over temporary addresses.
> 1 : enable Privacy Extensions and prefer temporary
addresses over public addresses.
Default: 0 (for most devices)
-1 (for point-to-point devices and loopback devices)

temp_valid_lft - INTEGER
valid lifetime (in seconds) for temporary addresses.
Default: 604800 (7 days)

temp_prefered_lft - INTEGER
Preferred lifetime (in seconds) for temporary addresses.
Default: 86400 (1 day)

max_desync_factor - INTEGER
Maximum value for DESYNC_FACTOR, which is a random value
that ensures that clients don't synchronize with each
other and generate new addresses at exactly the same time.
value is in seconds.
Default: 600

regen_max_retry - INTEGER
Number of attempts before giving up on generating
valid temporary addresses.
Default: 5

max_addresses - INTEGER
Maximum number of addresses per interface. 0 disables the limit.
It is recommended not to set too large a value (or 0), because allowing
too many autoconfigured addresses to be created would be an easy
way to crash the kernel.
Default: 16

icmp/*:
ratelimit - INTEGER
Limit the maximal rates for sending ICMPv6 packets.
0 to disable any limiting, otherwise the maximal rate in jiffies(1)
Default: 100


IPv6 Update by:
Pekka Savola
YOSHIFUJI Hideaki / USAGI Project


/proc/sys/net/bridge/* Variables:

bridge-nf-call-arptables - BOOLEAN
1 : pass bridged ARP traffic to arptables' FORWARD chain.
0 : disable this.
Default: 1

bridge-nf-call-iptables - BOOLEAN
1 : pass bridged IPv4 traffic to iptables' chains.
0 : disable this.
Default: 1

bridge-nf-call-ip6tables - BOOLEAN
1 : pass bridged IPv6 traffic to ip6tables' chains.
0 : disable this.
Default: 1

bridge-nf-filter-vlan-tagged - BOOLEAN
1 : pass bridged vlan-tagged ARP/IP traffic to arptables/iptables.
0 : disable this.
Default: 1


UNDOCUMENTED:

dev_weight FIXME
discovery_slots FIXME
discovery_timeout FIXME
fast_poll_increase FIXME
ip6_queue_maxlen FIXME
lap_keepalive_time FIXME
lo_cong FIXME
max_baud_rate FIXME
max_dgram_qlen FIXME
max_noreply_time FIXME
max_tx_data_size FIXME
max_tx_window FIXME
min_tx_turn_time FIXME
mod_cong FIXME
no_cong FIXME
no_cong_thresh FIXME
slot_timeout FIXME
warn_noreply_time FIXME

$Id: ip-sysctl.txt,v 1.20 2001/12/13 09:00:18 davem Exp $



..................................................................................................

............................................................................................

the NFS settings I used while at singlefin.net (2.4 kernels, Debian Linux)
this did wonders!!

/etc/fstab

"nfsserver:/mnt/export /mnt/export nfs rw,hard,intr,rsize=8192,wsize=8192 0 0"

test with
# ./iozone -a -R -c -U /mnt/export -f /mnt/export/testfile > file.log


http://nfs.sourceforge.net/nfs-howto/performance.html
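
To double-check which options an fstab entry like the one above really sets, the six fields can be split apart in Python. A rough sketch only; parse_fstab_entry is a hypothetical helper, not part of any NFS tool.

```python
def parse_fstab_entry(line):
    """Split one fstab line into its six fields and a dict of mount options."""
    device, mount_point, fs_type, options, dump, fsck_pass = line.split()
    opts = {}
    for item in options.split(","):
        key, _, value = item.partition("=")
        opts[key] = value or True  # flags such as "hard" carry no value
    return {
        "device": device, "mount_point": mount_point, "type": fs_type,
        "options": opts, "dump": int(dump), "pass": int(fsck_pass),
    }

entry = parse_fstab_entry(
    "nfsserver:/mnt/export /mnt/export nfs rw,hard,intr,rsize=8192,wsize=8192 0 0"
)
print(entry["options"])
```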

Sunday, January 01, 2006

 

Performance Tuning for Linux Servers

--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------


Installation
--------------------------------------------------------------------
Use separate partitions for root( / ), swap, /var, /usr, and /home

Most drives today pack more sectors on the outer tracks of the hard drive platter than on the inner tracks, so it’s much faster to read and write data from the outer tracks. Lower-numbered partitions are usually allocated at the outer tracks (for example, /dev/hda1 is closer to the drive’s outer edge than /dev/hda3), so place partitions that require frequent access first.
? http://www.pcguide.com/ref/hdd/geom/tracksZBR-c.html ?
The first partition should be the swap partition (to optimize memory swap operations).

The next partition should be /var because log entries are frequently written to /var/log.

The next partition should be /usr, because base system utilities and commands are placed in /usr.

The root and /home partitions can reside near the end of the drive.

USE MULTIPLE DRIVES
-Place frequently accessed partitions on the faster drives !duh, of course!
-Place frequently accessed partitions (ie /var and /usr) on separate drives.
-Use RAID
-(IDE) Place each drive as master device on its own I/O channel


ext3 is journaling
convert ext2 to ext3
tune2fs -j /dev/hda1

ReiserFS - Best performance with small files
xfs - Best performance, especially with large files


RAID
mkraid -V
cat /proc/mdstat
Create or modify /etc/raidtab
#---
# Create RAID device md0
# New RAID device
raiddev /dev/md0
# RAID 0 as example here
raid-level 0
# Assume two disks
nr-raid-disks 2
# Automatically detect RAID devices on boot
persistent-superblock 1
# Writes 32 KB of data to each disk
chunk-size 32
device /dev/hda1
raid-disk 0
device /dev/hdc1
raid-disk 1
#---

mkraid /dev/md0
mkreiserfs /dev/md0
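
How a 32 KB chunk size spreads data over the two disks can be sketched in Python. This is a simplified model of RAID 0 striping (equal-sized disks, no superblock offset), and raid0_locate is a hypothetical helper.

```python
def raid0_locate(offset_bytes, chunk_kb=32, disks=2):
    """Map a byte offset on the md device to (disk index, byte offset on that disk)."""
    chunk_bytes = chunk_kb * 1024
    chunk_index = offset_bytes // chunk_bytes   # which chunk of the array
    disk = chunk_index % disks                  # chunks alternate between disks
    stripe = chunk_index // disks               # full stripes before this chunk
    return disk, stripe * chunk_bytes + offset_bytes % chunk_bytes

for offset in (0, 32768, 40000, 65536):
    print(offset, raid0_locate(offset))
```

Consecutive 32 KB chunks alternate between /dev/hda1 (disk 0) and /dev/hdc1 (disk 1), which is why large sequential transfers can keep both drives busy.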

IDE DISKS
Verify that DMA is enabled:
hdparm -d /dev/hda

If DMA is not enabled, enable it by issuing the following command:
hdparm -d 1 /dev/hda

Verify that 32-bit transfers are enabled:
hdparm -c /dev/hda

If 32-bit transfers are not enabled, enable them by issuing the following command:
hdparm -c 1 /dev/hda

Verify the effectiveness of the options by running simple disk read tests as follows:
hdparm -T -t /dev/hda

--------------------------------------------------------------------
--------------------------------------------------------------------


2.6 Kernel Features:
--------------------------------------------------------------------
I/O Elevators - anticipatory and deadline
An elevator is a queue in which I/O requests are ordered by their sector position on disk
default = anticipatory //anticipating the “next” read operation
database applications, which seek all over the disk performing reads and synchronous writes, suffer with anticipatory I/O
for such workloads the deadline scheduler gives about a 10% performance improvement over the anticipatory scheduler; select it by booting with elevator=deadline on the kernel command line
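
To confirm which elevator is active after booting, 2.6 kernels that expose /sys/block/<device>/queue/scheduler list the available schedulers there with the active one in square brackets. A small Python sketch that parses such a line; the sysfs path is an assumption about the running kernel, and active_scheduler is a hypothetical helper.

```python
def active_scheduler(text):
    """Return the bracketed (active) scheduler name from a scheduler listing."""
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None

# Example content; on a real system read it from
# /sys/block/hda/queue/scheduler (path assumed, adjust the device name).
print(active_scheduler("noop anticipatory [deadline] cfq\n"))
```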


Huge TLB Page Support -
!!KJR TLB=Translation Lookaside Buffer!!
TLB is the processor’s cache of virtual-to-physical memory address translations
a large TLB entry can map a 2MB or 4MB page, thus reducing the number of TLB misses
a TLB miss is very costly in terms of processor cycles
applications use huge pages through mmap system calls or shared memory system calls
kernel config:
CONFIG_HUGETLB_PAGE (under processor section)
CONFIG_HUGETLBFS (under file system section)
cat /proc/meminfo #show huge page size support
cat /proc/filesystems #hugetlbfs
cat /proc/sys/vm/nr_hugepages #configured huge pages

tune with:
echo x >/proc/sys/vm/nr_hugepages
x is the number of huge pages to be preallocated (a page count, not a size in megabytes)
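
Since nr_hugepages takes a page count, the value for a given amount of memory depends on the Hugepagesize line in /proc/meminfo. A minimal Python sketch of that calculation; huge_pages_needed is a hypothetical helper and the sample text stands in for the real file.

```python
import re

def huge_pages_needed(meminfo_text, wanted_mb):
    """Number of huge pages that cover wanted_mb megabytes, rounded up."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Hugepagesize line: kernel built without huge page support?")
    page_kb = int(match.group(1))
    wanted_kb = wanted_mb * 1024
    return (wanted_kb + page_kb - 1) // page_kb  # ceiling division

sample = "HugePages_Total:     0\nHugePages_Free:      0\nHugepagesize:     2048 kB\n"
print(huge_pages_needed(sample, 512))
```

The printed number is what would be echoed into /proc/sys/vm/nr_hugepages to reserve roughly 512 MB.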

--------------------------------------------------------------------
--------------------------------------------------------------------


Logging Facility:
--------------------------------------------------------------------
!duh, of course!
/var/log/messages
/var/log/XFree86.0.log

man logger
Logger makes entries in the system log

/etc/syslog.conf
/etc/sysconfig/syslog (RedHat)
kern.* /var/adm/kernel
kern.crit @remotehost #KJR says /etc/hosts loghost
kern.crit /dev/console
kern.info;kern.!err /var/adm/kernel-info

--------------------------------------------------------------------
--------------------------------------------------------------------


System Initialization
--------------------------------------------------------------------
init reads /etc/inittab

BSD -
/etc/rc.d/rc.S - single file rc.S //daemons started
rc.S file enables the system’s virtual memory, mounts necessary file systems, cleans up certain log directories, initializes Plug and Play devices, loads kernel modules, configures PCMCIA devices, and sets up serial ports.
local script (rc.local) is available

System V - multiple independent files; each runlevel is given a subdirectory
scripts are run from runlevels 0 to 6
/etc/rc.d rc0.d to rc6.d and init.d
links to the master scripts stored in /etc/rc.d/init.d
K is kill
S is start
scripts run in numeric order
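
To see the ordering, a minimal sketch that sorts some made-up link names the way they would be walked in a runlevel directory. The names are hypothetical; the point is only that K links come before S links and the two-digit number orders scripts within each group.

```shell
# Hypothetical link names as they might appear in /etc/rc.d/rc3.d
rc_order() {
    # In the C locale "K" sorts before "S", and the two-digit
    # prefix gives the numeric order within each group.
    printf '%s\n' "$@" | LC_ALL=C sort
}

rc_order S55sshd K20nfs S10network S12syslog
```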


Initialization Table (/etc/inittab)
id:runlevel:action:process

id =
unique identifier

runlevel =
0=halt, 1=single, 2=multiuser w/o NFS, 3=multiuser full, 4=unused, 5=X, 6=reboot

action =
respawn, once, sysinit, boot, bootwait, wait, off, ondemand, initdefault, powerwait, powerfail, powerokwait, ctrlaltdel, or kbrequest

process =
specific process or program to run
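
A quick sketch of pulling the four fields out of an inittab entry. The function name and the two sample entries are assumptions for illustration, typical of a RedHat-style inittab.

```shell
# Split one id:runlevel:action:process entry on colons.
parse_inittab_line() {
    echo "$1" | {
        # The last variable keeps the rest of the line, spaces included.
        IFS=: read -r id runlevels action process
        echo "id=$id runlevels=$runlevels action=$action process=$process"
    }
}

parse_inittab_line 'id:5:initdefault:'
parse_inittab_line '1:2345:respawn:/sbin/mingetty tty1'
```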

--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------

Kernel Overview

--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------
Linus Torvalds in 1991 on Intel 80x86 processor !duh, of course!
kernel = interact/control system hardware components !duh, of course!
kernel = provide an environment in which applications can run !duh, of course!

Linux kernel is monolithic
Linux kernels extended by modules
module is an object that can be linked to the kernel at runtime

microkernel operating systems provide bare, minimal functionality, and all other operating system layers are performed on top of microkernels as processes
microkernels are slow due to message passing between the various layers

--------------------------------------------------------------------

/proc File System
virtual file system that is created dynamically by the kernel
used to provide data about the system and to fine-tune the kernel

--------------------------------------------------------------------

Memory Management
address space, physical memory, memory mapping, paging, and swapping

Address Space - "virtual memory" space is mapped to physical memory
address space is a flat linear address space
linear address space is divided into two parts: user address space and kernel address space
x86 32-bit architecture supports 4GB address space
3GB is reserved for user space and 1GB is reserved for the kernel
location of the split is determined by the PAGE_OFFSET kernel configuration variable
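
The 3GB/1GB arithmetic, as a sketch in bash. PAGE_OFFSET=0xC0000000 is assumed here as the usual value for this split on 32-bit x86.

```shell
PAGE_OFFSET=0xC0000000          # assumed default for the 3GB/1GB split
TOTAL=$(( 1 << 32 ))            # 4GB linear address space

user_mb=$(( PAGE_OFFSET / 1024 / 1024 ))
kernel_mb=$(( (TOTAL - PAGE_OFFSET) / 1024 / 1024 ))
echo "user=${user_mb}MB kernel=${kernel_mb}MB"
```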

!!KJR VM=Virtual Memory or Manager!!
Physical Memory - VM represents this arrangement as a node
Each node is divided into a number of blocks called zones that represent ranges within memory
ZONE_DMA - First 16MB of memory
ZONE_NORMAL - 16MB – 896MB
ZONE_HIGHMEM - 896MB – end
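
A tiny sketch that maps a physical address (given in MB) to the zone ranges listed above. The function name zone_for_mb is invented for this note.

```shell
# $1 = physical address expressed in whole megabytes
zone_for_mb() {
    if [ "$1" -lt 16 ]; then echo ZONE_DMA
    elif [ "$1" -lt 896 ]; then echo ZONE_NORMAL
    else echo ZONE_HIGHMEM
    fi
}

for mb in 8 512 2048; do
    echo "${mb}MB -> $(zone_for_mb "$mb")"
done
```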

Memory Mapping
kernel has only 1GB of virtual address space for its use
3GB is reserved for user space
Intel PAE (Physical Address Extension) Pentium processors support up to 64GB of physical memory
when the kernel needs to address a page in high memory, it maps that page into a small virtual address space (kmap) window, operates on that page, and unmaps the page
64-bit architectures do not have this problem because their address space is huge

Paging
Virtual address space is divided into fixed-size chunks called pages
three-level paging mechanism
Page Global Directory (PGD)
Page Middle Directory (PMD)
Page Table (PTE)
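
To make the three levels concrete, a sketch that splits a 32-bit virtual address into PGD, PMD, and PTE indexes plus the page offset. It assumes the x86 PAE layout (2 + 9 + 9 + 12 bits); without PAE the middle level is folded away on x86, and other architectures use different widths. The function name is made up.

```shell
# Assumed bit layout (x86 PAE): [31:30] pgd, [29:21] pmd, [20:12] pte, [11:0] offset
split_vaddr() {
    local a=$(( $1 ))
    echo "pgd=$(( (a >> 30) & 0x3 )) pmd=$(( (a >> 21) & 0x1ff )) pte=$(( (a >> 12) & 0x1ff )) offset=$(( a & 0xfff ))"
}

split_vaddr 0xC0101234
```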

Swapping
Swapping is the moving of an entire process to and from secondary storage when the main memory is low
swapping out whole processes makes context switches very expensive
so in Linux, swapping is performed at the page level rather than at the process level
major disadvantage of swapping is speed - disks are very slow

--------------------------------------------------------------------

Processes, Tasks, and Kernel Threads
task is simply a generic “description of work that needs to be done,” whether it is a lightweight thread or a full process
thread is the most lightweight instance of a task
process is a “heavier” data structure, Several threads can operate within a single process
kernel thread is a thread that always operates in kernel mode and has no user context

Threads and processes are scheduled identically by the scheduler

--------------------------------------------------------------------

Scheduling and Context Switching
Linux scheduler principle:
slow-running processes are better than processes that stop dead in their tracks either due to deliberate choices in scheduling policies or outright bugs

context switch = process stops running and another replaces it
overhead for this is high
timeslice = period of time in which to run

--
example:
a disk with data ready causes an interrupt
kernel calls the interrupt handler
interrupting the process that is currently running
utilizing many of its resources
currently running process resumes
effect steals time from the currently running process
--

Interrupt handlers are usually very fast and compact and thereby handle and clear interrupts quickly
an interrupt utilizes a random process’s resources

--------------------------------------------------------------------

Interprocess Communications (IPC) ipcs

Signals - job control
SIGSTOP signal causes a process to halt its execution
SIGKILL signal causes a process to exit and cannot be caught or ignored


pipe is a unidirectional, first-in first-out (FIFO), unstructured stream of data
named pipes are not temporary objects; they are entities in the file system and can be created using the mkfifo command
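
A minimal named pipe round trip, assuming a scratch directory from mktemp; the file name demo.fifo is arbitrary. The writer runs in the background because opening a FIFO for writing blocks until a reader shows up.

```shell
dir=$(mktemp -d)
fifo="$dir/demo.fifo"                       # hypothetical name
mkfifo "$fifo"

kind=$(ls -l "$fifo" | cut -c1)             # first character of the mode string
echo "hello through the fifo" > "$fifo" &   # writer blocks until a reader opens
line=$(cat "$fifo")                         # reader drains the pipe
wait

echo "type=$kind data=$line"
rm -rf "$dir"
```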

##file perms## ## ## ## ## ## ## ##
##########################################
from the 'man ls' on apple OSX:

b Block special file.
c Character special file.
d Directory.
l Symbolic link.
s Socket link.
p FIFO.
- Regular file.

!!KJR - it really perturbs me that linux man pages do not have this. Typically I'd have to shell into a solaris box to get this kind of info. Thank you apple OSX BSD flavor!!
d = directory
l = symbolic link
s = socket
p = named pipe
- = regular file
c = character (unbuffered) device file special
b = block (buffered) device file special

umask - bits masked off from the default 0777 or 0666
default file creation 666
default directory creation 777

# umask
0022
# touch file
# mkdir dir
# ls -al
drwxr-xr-x 2 root root 4096 Dec 31 15:05 dir
-rw-r--r-- 1 root root 0 Dec 31 15:05 file
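
The same result worked out by hand, as a sketch: the umask bits are cleared from the default mode. The function name is invented; arguments are octal strings.

```shell
# $1 = default creation mode, $2 = umask (both octal, with leading 0)
mode_after_umask() {
    printf '%04o\n' $(( $1 & ~$2 ))
}

mode_after_umask 0666 0022   # regular file
mode_after_umask 0777 0022   # directory
```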

1 is set sticky bit
2 is set gid
4 is set uid

setuid is a potential security vulnerability because it runs the process as the owner of the file

ie. /tmp has sticky bit, noted by trailing "t". Anything created in this dir can be deleted or renamed only by its owner (or by the directory owner or root).
drwxrwxrwt 18 root root 4096 Dec 31 15:04 /tmp

!!KJR - They get more (S|s)quirrely:!!

The next three fields are three characters each: owner permissions, group permissions, and other permissions. Each field has three character positions:

1. If r, the file is readable; if -, it is not readable.

2. If w, the file is writable; if -, it is not writable.

3. The first of the following that applies:

S If in the owner permissions, the file is not executable and set-user-ID mode is set. If in the group permissions, the file is not executable and set-group-ID mode is set.

s If in the owner permissions, the file is executable and set-user-ID mode is set. If in the group permissions, the file is executable and set-group-ID mode is set.

x The file is executable or the directory is searchable.

- The file is neither readable, writable, executable, nor set-user-ID nor set-group-ID mode, nor sticky. (See below.)

These next two apply only to the third character in the last group (other permissions).

T The sticky bit is set (mode 1000), but not execute or search permission. (See chmod(1) or sticky(8).)

t The sticky bit is set (mode 1000), and is searchable or executable. (See chmod(1) or sticky(8).)
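
To see the s/S and t/T letters for real, a sketch that sets the bits on scratch files under a mktemp directory (the names d and f are arbitrary). Only the first ten characters of the ls mode string are kept, since some systems append an extra marker.

```shell
dir=$(mktemp -d)
mkdir "$dir/d"
touch "$dir/f"

chmod 1777 "$dir/d"; sticky_x=$(ls -ld "$dir/d" | cut -c1-10)     # sticky + search
chmod 1776 "$dir/d"; sticky_nox=$(ls -ld "$dir/d" | cut -c1-10)   # sticky, no search for other
chmod 4755 "$dir/f"; suid_x=$(ls -l "$dir/f" | cut -c1-10)        # setuid + executable
chmod 4644 "$dir/f"; suid_nox=$(ls -l "$dir/f" | cut -c1-10)      # setuid, not executable

echo "$sticky_x $sticky_nox $suid_x $suid_nox"
rm -rf "$dir"
```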

http://www.comptechdoc.org/os/linux/usersguide/linux_ugfilesp.html
##########################################
##file perms## ## ## ## ## ## ## ##

System V IPC Mechanisms

Message Queues - allow one or more processes to write messages that are read by one or more reading processes
in many respects, message queues are similar to pipes
Message queues pass data in messages rather than as an unformatted stream of bytes, allowing data to be processed easily
messages can be associated with a type, so the receiver can check for urgent messages before processing non-urgent messages

Semaphores - objects that support two atomic operations: set and test
counters that control access to shared resources by multiple processes
used as a locking mechanism to prevent processes from accessing a particular resource while another process is using it
problem = deadlocking, occurs when one process has altered a semaphore’s value as it enters a critical region but then fails to leave the critical region because it crashed or was killed
Linux protects by maintaining lists of adjustments to the semaphore arrays

Shared Memory - allows one or more processes to communicate via memory that appears in all of their virtual address spaces
Access to shared memory areas is controlled through keys and access rights checking

--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------

Processors and Multiprocessing
--------------------------------------------------------------------
--------------------------------------------------------------------
16 processors on 2.4-based kernels
32 processors on 2.6-based kernels
up to 512 processors on some architectures ??x86_64??

NUMA (Nonuniform Memory Access)
Non-Uniform Memory Architecture (NUMA).

or

Cluster
high-performance clusters (HPCs) higher node count 100+
spreading the work across a large number of nodes
Each node in an HPC has its own local disk storage to maintain the operating system, provide swap space, store programs

high-availability clusters 2-16
operates as an enterprise server
HA cluster consists minimally of two independent computers with a “heartbeat” monitoring program that monitors the health of the other node(s) in the cluster
http://linux-ha.org

--------------------------------------------------------------------
Symmetrical Multiprocessing (SMP)


Loosely coupled systems consist of processors that operate stand-alone
-Each processor has its own bus, memory, and I/O subsystem, and communicates with other processors through the network medium

Tightly coupled systems consist of processors that share the memory, bus, devices, and sometimes cache
-run a single instance of the operating system

!!KJR - Myth Buster = no SMP system is 100% scalable, because of the overhead involved in maintaining additional processors!!

--------------------------------------------------------------------
Simultaneous Multithreading (SMT)
single physical CPU appears as two or more virtual CPUs
virtual CPUs share the core resources of the physical processor
Simultaneous multithreading allows two or more tasks to be executed simultaneously in the processor
it has scheduler implications




--------------------------------------------------------------------

File Systems
--------------------------------------------------------------------

Virtual File System (VFS)
Virtual file system (VFS) allows Linux to support many, often very different, file systems, each presenting a common software interface to the VFS
Virtual File System layer allows you to transparently mount many different file systems at the same time

ext2fs

LVM - Logical Volume Manager
volume manager is used to hide the physical storage characteristics from the file systems and higher-level applications

RAID - Redundant Array of Inexpensive Disks
RAID-Linear = concatenation
RAID-0 = striping
RAID-1 = mirroring
RAID-5 = striping with parity
!! KJR RAID TEN is zero across one, striping across multiple mirrors !!
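
A back-of-the-envelope sketch of usable capacity per level, assuming equal-size disks, RAID-1 as one mirror set across all disks, and RAID-10 built from two-way mirrors. The function name is made up.

```shell
# $1 = level, $2 = number of disks, $3 = size of each disk in GB
raid_capacity() {
    case "$1" in
        linear|0) echo $(( $2 * $3 )) ;;          # concatenation or striping
        1)        echo "$3" ;;                    # every disk holds the same data
        5)        echo $(( ($2 - 1) * $3 )) ;;    # one disk's worth goes to parity
        10)       echo $(( $2 / 2 * $3 )) ;;      # stripe over two-way mirrors
        *)        echo "unknown level" >&2; return 1 ;;
    esac
}

for level in 0 1 5 10; do
    echo "RAID-$level, 4 x 100GB: $(raid_capacity "$level" 4 100)GB usable"
done
```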

devfs - virtual device file system. ie, like procfs
/dev
Device drivers can register devices to devfs through device names instead of through the traditional major-minor number scheme
namespace is not limited by the number of major and minor numbers

--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------

--------------------------------------------------------------------
--------------------------------------------------------------------

Memory
32-bit processors have a 4GB limit on memory addressability (2 raised to the 32nd power)
64-bit processors maximum address (2 raised to the 64th power)

32-bit processors (Pentium) implement additional address bits for accessing physical addresses greater than 32 bits
via virtual addressing by use of additional bits in page table entries
x86-based processors currently support up to 64GB of physical memory through this mechanism
virtual addressability is still restricted to 4GB
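
The limits as arithmetic, in bash: 32 address bits give 4GB, and the 36 physical address bits of PAE give 64GB. The function name is invented for this note.

```shell
# $1 = number of address bits; result in GB
gb_for_bits() {
    echo $(( (1 << $1) >> 30 ))
}

echo "32 bits: $(gb_for_bits 32)GB"
echo "36 bits: $(gb_for_bits 36)GB"
```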

--------------------------------------------------------------------
I/O
disks limited to 256 in the 2.4 kernel series
Multipath I/O (MPIO) provides more than one path to a storage device


--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------

System Performance Monitoring
--------------------------------------------------------------------

CPU Utilization
cat /proc/cpuinfo

uptime
iostat
vmstat
top
sar //from sysstat pkg !!KJR - System Activity Reporter!!

load average represents the average number of tasks that could be run over a period of 1, 5, and 15 minutes
linux load average
http://www.teamquest.com/resources/gunther/ldavg1.shtml
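
A rough way to put the 1-minute load average in context is to divide it by the number of CPUs. The sketch below reads a line in the /proc/loadavg format from stdin; the sample line and the CPU count of 4 are made up.

```shell
# $1 = number of CPUs; stdin = a line in /proc/loadavg format
load_per_cpu() {
    awk -v n="$1" '{ printf "%.2f\n", $1 / n }'
}

echo "2.00 1.50 1.00 3/120 4321" | load_per_cpu 4
# On a live system, something like: load_per_cpu 4 < /proc/loadavg
```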



---
# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 0 5336 40564 134016 0 0 3 30 325 233 5 2 92 0

--procs
r = running processes
b = blocked processes

--memory
swpd = memory swapped out #linux kswapd
free = free memory
buff = buffer cache for I/O data
cache = memory for file reads on disk in kilobytes

--swap
si = memory swapped in from disk #linux page fault activities as pages are swapped back to physical mem.
so = memory swapped out to disk in kilobytes per second

--io
bi = block read in from devices
bo = block written out to devices

--system
in = interrupts
cs = context switches

--cpu
us = user
sy = system
id = true idleness
wa = waiting for I/O completion
---

Also Look At: /proc/irq/ID/smp_affinity
if 0x0001 is echoed to /proc/irq/ID/smp_affinity, where ID is the IRQ number of a device, only CPU 0 will process IRQs for this device
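
The value is a CPU bitmask: bit N set means CPU N may handle the interrupt. A small sketch for building the hex mask for a single CPU; the function name is invented, and the actual write (to the smp_affinity file of the IRQ) is left as a comment since it needs root and a real IRQ number.

```shell
# $1 = CPU number; prints the hex mask with only that CPU's bit set
cpu_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

cpu_mask 0
cpu_mask 3
# As root, for a real IRQ number in place of ID:
#   echo "$(cpu_mask 0)" > /proc/irq/ID/smp_affinity
```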

--------------------------------------------------------------------
Memory Utilization

cat /proc/meminfo
cat /proc/slabinfo

/proc/meminfo
MemTotal = total amount of physical memory of the system
MemFree = total amount of unused memory
Buffers = buffer cache for I/O operations
Cached = memory reading files from disk
SwapCached = amount of cache memory that has been swapped out in the swap space
SwapTotal = amount of disk memory for swapping purposes
HighTotal = memory greater than ~860MB of the physical memory
LowTotal = memory used by the kernel
Mapped = files that are memory-mapped
Slab = memory used for the kernel data structures

If an IA32-based system has more than 1GB of physical memory, HighTotal is nonzero

/proc/slabinfo
tcp_bind_bucket 56 224 32 2 2 1

first column lists the names of the kernel data structures
56 of which are active
total of 224 tcp_bind_bucket
Each data structure takes up 32 bytes
There are two pages that have at least one active object,
and there is a total of two allocated pages
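
The same reading done with awk, as a sketch against the sample line above. It assumes the column order shown (name, active objects, total objects, object size); newer slabinfo formats add more columns.

```shell
# stdin = slabinfo lines in the "name active total objsize ..." layout
slab_summary() {
    awk '{ printf "%s: %d of %d objects active (%d%%), %d bytes of objects\n",
                  $1, $2, $3, $2 * 100 / $3, $3 * $4 }'
}

echo "tcp_bind_bucket 56 224 32 2 2 1" | slab_summary
```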

--------------------------------------------------------------------
ps aux
!!KJR aux is better than -ef!!

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

root 1 0.0 0.0 1528 528 ? S 15:24 0:00 init [2]

%CPU = percentage of CPU time that each process consumes
%MEM = total percentage of system memory that each process consumes
VSZ = virtual memory footprint
RSS = amount of physical memory that the process is currently using

/proc/pid/maps
layout of the process's virtual address space.
where pid is the process ID of a particular process
cat /proc/3162/maps
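
Each maps line starts with a start-end address range in hex, so the size of a mapping is a subtraction. A bash sketch over two made-up lines in that format; the function name is invented.

```shell
# stdin = lines in /proc/pid/maps format; prints size, permissions, and path
map_sizes() {
    local range perms offset dev inode path start end
    while read -r range perms offset dev inode path; do
        start=${range%-*}
        end=${range#*-}
        echo "$(( (0x$end - 0x$start) / 1024 ))kB $perms ${path:-[anon]}"
    done
}

# Hypothetical sample lines
printf '%s\n' \
    "08048000-0804c000 r-xp 00000000 03:01 12345 /bin/cat" \
    "40000000-40015000 rw-p 00000000 00:00 0" | map_sizes
```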


Check It Tips:

monitor I/O workloads
vmstat: bi and bo transfer rate

see whether the system is swapping
vmstat: swpd, si, and so
if the system is swapping, check the swapping rate (si and so)

monitor CPU utilization of us, sy, id, and wa.
If wa is large, you need to examine the I/O subsystem
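
These checks can be scripted against a vmstat data line. A sketch, assuming the 16-column layout of the sample shown earlier (newer vmstat versions append a column); the function name is invented.

```shell
# stdin = one vmstat data line: r b swpd free buff cache si so bi bo in cs us sy id wa
check_vmstat() {
    awk '{ swap = ($7 + $8 > 0) ? "yes" : "no"
           printf "swapping=%s bi=%s bo=%s wa=%s%%\n", swap, $9, $10, $16 }'
}

echo "1 0 0 5336 40564 134016 0 0 3 30 325 233 5 2 92 0" | check_vmstat
```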

--------------------------------------------------------------------

I/O Utilization

iostat
sar

iostat reports CPU utilization similar to how it is provided by the top tool
splits the CPU time into user, nice, system, I/O wait, and system idle

--------------------------------------------------------------------

Network Utilization

netstat, nfsstat, tcpdump, ethtool, snmp, ifport, ifconfig, route, arp, ping, traceroute, host, and nslookup
!!KJR what about mii-tool !!
!!KJR what about ip !!
!!KJR what about tc !!

ping (ICMP) = Internet Control Message Protocol - Echo function
A small packet is sent through the network to a given IP address
icmp type 255 = any
http://www.iana.org/assignments/icmp-parameters

route !!no duh!!
ie. route add default gw 192.168.0.1 //adds a default gateway

Flags Possible flags include
U (route is up)
H (target is a host)
G (use gateway)
R (reinstate route for dynamic routing)
D (dynamically installed by daemon or redirect)
M (modified from routing daemon or redirect)
A (installed by addrconf)
C (cache entry)
! (reject route)


arp !!no duh!! Address Resolution Protocol
ie. arp -d hostname //deletes arp entry of 'hostname' from arp table
Flags are same as from route

traceroute - find hops

tcpdump - sniffs network packets

host is a tool used to retrieve the host name for a given IP address from the Domain Name System

#network traffic
netstat -i
netstat -s
ip -s link


netstat -rn // shows routes
netstat -nlut // show open ports


ifconfig eth0:1 creates an alias

MAC stands for Media Access Control !!KJR Not Machine Address Code like most think!!
six hexadecimal numbers
ifconfig eth0 down hw ether 00:00:00:00:00:01
ifconfig eth0 up

nfsstat
network file system

--------------------------------------------------------------------
--------------------------------------------------------------------
System Trace Tools
identify performance problems and bottlenecks

top

strace

Oprofile
opcontrol initializes the OProfile tool
oprof_start is a GUI interface
oprofpp produces reports
op_time produces summary reports relative to the binaries that are running on the system
op_to_source tool generates annotated source for assembly listings
op_merge merges profiling samples

Performance Inspector
swtrace ai
run.tprof command to perform the trace and produce the default reports
run.itrace

vtune

dprobes - kernel and load-module debug tracing-type information

tracer - hooks into the kernel and provides tracing information


--------------------------------------------------------------------
--------------------------------------------------------------------
Benchmarks
Component benchmarks are often referred to as microbenchmarks
larger benchmarks are often referred to as application benchmarks or enterprise benchmarks

Operating System Benchmark Tools:
LMbench
AIM7 and AIM9
Reaim
SPEC SDET

Disk Benchmark Tools:
Bonnie/Bonnie++
IOzone
IOmeter
tiobench
dbench

Network Benchmark Tools:
Netperf
SPEC SFS

Application Benchmark Tools:
The Java benchmarks Volanomark, SPECjbb, and SPECjvm
PostMark
Database benchmarks
postfix, included w/ source

Database Benchmark Tools:
Open Source Development Lab
TPC http://www.tpc.org
SPEC benchmarks http://www.spec.org
Oracle Applications Standard Benchmark
SAP Standard Application Benchmark
MySQL, included w/ source

Web Server Benchmark Tools:
SPECweb, SPECweb SSL, and TPC-W
SPECjAppServer and ECPerf


oprofile
http://oprofile.sourceforge.net

performance inspector
http://perfinsp.sourceforge.net

linux trace toolkit
http://www.opersys.com/LTT/

dprobes
http://dprobes.sourceforge.net

vtune
http://www.intel.com/cd/software/products/asmo-na/eng/vtune/index.htm

--------------------------------------------------------------------
Performance Evaluation Methodologies
aka Theory:

-Tracing
-Workload Characterization
-Numerical Analysis
-Simulation

SPECweb99. Representative of web serving performance.

SPECsfs. Representative of NFS performance.

Database query. Representative of database query performance.

NetBench. Representative of SMB file-serving performance

Netperf3. Measures the performance of the network stack, including TCP, IP, and network device drivers.

VolanoMark. Measures the performance of the scheduler, signals, TCP send/receive, and loopback.

Block I/O test. Measures the performance of VFS, raw and direct I/O, block device layer, SCSI layer, and low-level SCSI/fibre device driver.

Lmbench. Used to measure performance of the Linux APIs.

IOzone. Used to measure native file system throughput.

dbench. Used to measure the file system component of NetBench.

SMB Torture. Used to measure SMB file-serving performance.


--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------

System Tuning

2.6 Linux Scheduler
nice() system call
priority classes, varying from 0 to MAX_PRIO-1, where MAX_PRIO=140
The first MAX_RT_PRIO priorities, where MAX_RT_PRIO=100, are set aside for real-time tasks
The remaining 40 priority classes, [100..139], are set aside for time sharing (that is, normal) jobs,
representing the [-20..19] nice value of UNIX processes
-20 (highest priority) to 19 (lowest)
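
The mapping from nice value to static priority is a fixed offset: MAX_RT_PRIO + nice + 20. A sketch of that arithmetic; the shell function name is made up.

```shell
MAX_RT_PRIO=100

# $1 = nice value in [-20..19]; prints the static priority
nice_to_prio() {
    echo $(( MAX_RT_PRIO + $1 + 20 ))
}

for n in -20 0 19; do
    echo "nice $n -> priority $(nice_to_prio "$n")"
done
```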

sleep average is a number in the range of [0..MAX_SLEEP_AVG(=10 seconds)]

timeslice is the maximum time a task can run before yielding to another task
range of [MIN_TIMESLICE(=10 milliseconds)..MAX_TIMESLICE(=200 milliseconds)]

STARVATION_LIMIT(=10seconds) times the number of tasks in the run queue

scheduler attempts to keep the system load as balanced as possible
event balancing = rebalance code when tasks change state or make specific system calls
active balancing = specified intervals measured in jiffies
!!KJR "I'll have that done in a jiffie" !!

Active balancing happens at each tick

CHILD_PENALTY - percentage of the parent’s sleep average that a child inherits
Increasing the value of this parameter increases the child’s effective priority

CREDIT_LIMIT - number of times a task earns sleep_avg over MAX_SLEEP_AVG
Reducing the value of this parameter helps highly interactive tasks by raising them to the highly interactive level

EXIT_WEIGHT - penalized for creating children that are processor hogs relative to the parent
Setting this value to zero causes the parent to inherit the child’s sleep average when the child exits

INTERACTIVE_DELTA - determines the offset that is added in determining whether or not a task is considered interactive
parameter is increased, a task needs to accumulate a larger sleep average to be considered interactive

MAX_SLEEP_AVG - A task with this sleep average gets the maximum bonus as indicated by PRIO_BONUS_RATIO
Increasing the value of this parameter gives the highest-priority task more time for execution before it is rescheduled

MAX_TIMESLICE - timeslice that is allocated to the task with the highest static priority (MAX_RT_PRIO)

MIN_TIMESLICE - timeslice that is allocated to the task with the lowest static priority (MAX_PRIO-1)

PARENT_PENALTY - percentage of the sleep average that the parent is permitted to keep

PRIO_BONUS_RATIO - percentage of the priority range used to provide a temporary bonus to interactive tasks

STARVATION_LIMIT - multiplication factor used to decide whether an interactive task is placed in an active or expired array

--------------------------------------------------------------------
--------------------------------------------------------------------
Address Space
The kernel creates the basic skeleton of a process’s virtual address space when the fork() system call is initiated

User Address Space
Each address space is represented in the Linux kernel through an object known as the mm structure
mm structure is a reference counted object that exists as long as the reference count is greater than zero

The VM Area Structures
To circumvent the issue of large page tables, Linux does not represent address spaces with page tables per se, but utilizes a set of VM area structure lists instead
drawback of the VM area-based approach: if a process maps a significant number of different files into its address space, it ends up with a correspondingly large number of VM area structures

Kernel Address Space
vmalloc()
two platform-specific parameters VMALLOC_START and VMALLOC_END
a simple mapping formula (pfn = (addr - PAGE_OFFSET) / PAGE_SIZE)
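
The formula with numbers plugged in, as a bash sketch. PAGE_OFFSET=0xC0000000 and PAGE_SIZE=4096 are assumed values for 32-bit x86; the function name is invented.

```shell
PAGE_OFFSET=0xC0000000   # assumed: 3GB/1GB split
PAGE_SIZE=4096           # assumed: 4KB pages

# $1 = kernel virtual address in the identity-mapped segment
addr_to_pfn() {
    echo $(( ($1 - PAGE_OFFSET) / PAGE_SIZE ))
}

addr_to_pfn 0xC0100000   # 1MB above PAGE_OFFSET
```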

High-Memory Support
highmem Interface
highmem interface provides indirect access to this memory by dynamically mapping high-memory pages into a small portion of the kernel address space that is reserved for this purpose
kmap()

Paging and Swapping
When accessing a virtual page that is not present, the CPU generates a page fault
The technique of borrowing a page from a process and writing it to the disk subsystem is referred to as paging
swapping—a much more aggressive form of paging that steals not only an individual page, but also a process’s entire page set

Replacement Policy
procedure that determines which page to evict from the main memory subsystem
least recently used (LRU) approach analyzes the past behavior
most UNIX operating systems utilize variations of lower overhead replacement policies such as not recently used (NRU)
Linux relies on an LRU-based approach
not just a replacement policy, but also a memory balancing policy that determines how much memory is utilized for kernel buffers and how much is used to back virtual pages

Page Replacement and Memory Balancing
2 extra bits in each page-table entry = access and dirty bits
access bit indicates whether the page has been accessed since the access bit was last cleared
dirty bit indicates whether the page has been modified since it was last paged in
kswapd clears the access bit

Linux Page Tables
system maintains a page table for each process in physical memory and accesses the actual page tables via the identity mapped kernel segment
Page tables in Linux cannot be paged out to the swap space
per-process page table layout is based on a multilayer tree consisting of three levels
first layer consists of the global directory (pgd)
second layer consists of the middle directory (pmd)
third layer consists of the page table entry (pte)
different memory zones (ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM)
VM system impacts every other subcomponent in the system

rmap and objrmap
One new VM feature of the Linux 2.6 kernel is referred to as reverse mapping (rmap)
With objrmap, the struct page structure uses its mapping field to point to an address_space structure describing the object that backs up that particular page

Largepages Support
IA-32 architecture supports either 4KB or 4MB pages
Largepage usage is primarily intended to provide performance improvements for high-performance computing (HPC) and other memory-intensive applications
Linux utilizes either 2MB or 4MB largepages, AIX uses 16MB largepages, and Solaris uses 4MB
translation lookaside buffer (TLB)
number of available largepages can be configured through the proc file system
/proc/sys/vm/nr_hugepages
The core of the largepage implementation in Linux 2.6 is referred to as the hugetlbfs, a pseudo file system (implemented in fs/hugetlbfs/inode.c) based on ramfs
A process may access largepages either through the shmget() interface to set up a shared region that is backed by largepages or by utilizing the mmap() call on a file that has been opened in the huge page file system

Slab Allocator
In Linux 2.4, kmem_cache_reap() is called in low-memory situations
In Linux 2.6, the set_shrinker() function populates a struct with a pointer to the callback and a weight that indicates the complexity of re-creating the object

VM Tunables
/proc/sys/vm
# cd /proc/sys/vm
# ls
block_dump hugetlb_shm_group min_free_kbytes page-cluster
dirty_background_ratio laptop_mode nr_hugepages swappiness
dirty_expire_centisecs legacy_va_layout nr_pdflush_threads swap_token_timeout
dirty_ratio lowmem_reserve_ratio overcommit_memory vfs_cache_pressure
dirty_writeback_centisecs max_map_count overcommit_ratio

--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------

I/O Subsystems—Performance Implications

These threads perform raw I/O or are generated by VMM components of the kernel, such as the kswapd or pdflush threads

Scheduler Tunables
/sys/block/<device>/queue/iosched
Completely Fair Queuing (CFQ) I/O scheduler
Stochastic Fair Queuing (SFQ)

bdflush starts, flushes, or tunes the buffer-dirty-flush daemon

cat /proc/sys/vm/bdflush
50 500 0 0 500 3000 60 20 0

The first parameter (nfract), default 50, governs the maximum percentage of dirty buffers in the buffer cache
The second parameter (ndirty), default 500, is the maximum number of dirty buffers that bdflush can write to the disk at one time.
The third and fourth parameters are not currently used.
The fifth parameter (interval), default 500, is the delay between kupdate flushes
The sixth parameter (age_buffer), default 3000, is the time for a normal buffer to age before it is flushed.
The seventh parameter (nfract_sync), default 60, is the percentage of buffer cache that is dirty to activate bdflush synchronously
The eighth parameter (nfract_stop_bdflush), default 20, is the percentage of buffer cache that is dirty to stop bdflush
The ninth parameter is not currently used.

echo "100 1200 0 0 500 3000 60 20 0">/proc/sys/vm/bdflush
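
To keep the nine positions straight, a sketch that labels the fields in use. The function name is invented; positions 3, 4, and 9 are skipped because the notes above list them as unused.

```shell
# $1 = the nine space-separated bdflush values as one string
show_bdflush() {
    set -- $1    # split the values into $1..$9
    echo "nfract=$1 ndirty=$2 interval=$5 age_buffer=$6 nfract_sync=$7 nfract_stop_bdflush=$8"
}

show_bdflush "50 500 0 0 500 3000 60 20 0"
# On a 2.4 system, something like: show_bdflush "$(cat /proc/sys/vm/bdflush)"
```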

Setting up Raw I/O on Linux
# ln -s /dev/your_raw_dev_ctrl /dev/rawctl
raw -qa to see which raw device nodes are already in use
raw /dev/raw/raw1 /dev/sda5


--------------------------------------------------------------------
--------------------------------------------------------------------
--------------------------------------------------------------------
Network Tuning

sysctl
/proc/sys/net/core
/proc/sys/net/ipv4
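
The sysctl names in this section map to /proc/sys paths by swapping dots for slashes. A sketch of that mapping; the function name is made up, and it assumes no component of the name itself contains a dot.

```shell
# $1 = sysctl name such as net.core.rmem_max
sysctl_to_path() {
    echo "/proc/sys/$(echo "$1" | tr . /)"
}

sysctl_to_path net.core.rmem_max
sysctl_to_path net.ipv4.tcp_rmem
```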

Default Socket Buffer Size
net.core.wmem_default (/proc/sys/net/core/wmem_default)
net.core.rmem_default (/proc/sys/net/core/rmem_default)

................................................
Memory          <=4KB     <=128KB    >128KB
................................................
rmem_default    32KB      64KB       64KB
wmem_default    32KB      64KB       64KB
wmem_max        32KB      64KB       128KB
rmem_max        32KB      64KB       128KB
................................................

Maximum Socket Buffer Size
net.core.rmem_max (/proc/sys/net/core/rmem_max)
net.core.wmem_max (/proc/sys/net/core/wmem_max)

netdev_max_backlog
net.core.netdev_max_backlog (/proc/sys/net/core/netdev_max_backlog)
The default value is 300, which is typically too small for heavy network loads
Increasing this value permits a larger store of packets queued and reduces the number of packets dropped
dropped packets result in a significant reduction in throughput

somaxconn
net.core.somaxconn (/proc/sys/net/core/somaxconn)
default maximum is 128.

optmem_max
optmem_max (/proc/sys/net/core/optmem_max)
This variable is the maximum initialization size of socket buffers, expressed in bytes

TCP Buffer and Memory Management
net.ipv4.tcp_rmem (/proc/sys/net/ipv4/tcp_rmem)
This variable is an array of three integers:
net.ipv4.tcp_rmem[0] = minimum size of the read buffer
net.ipv4.tcp_rmem[1] = default size of the read buffer
net.ipv4.tcp_rmem[2] = maximum size of the read buffer
......................................................
Default TCP Socket Read Buffer Sizes

              Minimum[0]   Default[1]   Maximum[2]
......................................................
Low Memory    PAGE_SIZE    43689        43689*2
Normal        4KB          87380        87380*2
......................................................


tcp_wmem
net.ipv4.tcp_wmem (/proc/sys/net/ipv4/tcp_wmem)
As with the read buffer, the TCP socket write buffer is also an array of three integers:
net.ipv4.tcp_wmem[0] = minimum size of the write buffer
net.ipv4.tcp_wmem[1] = default size of the write buffer
net.ipv4.tcp_wmem[2] = maximum size of the write buffer
........................................................
Default TCP Socket Write Buffer Sizes

             Minimum[0]   Default[1]   Maximum[2]
........................................................
Low Memory   4KB          16KB         64KB
Normal       4KB          16KB         128KB
........................................................
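
Both tcp_rmem and tcp_wmem are read from /proc as three whitespace-separated integers. A small Python sketch of turning that text into named fields; the class and function names are hypothetical, and the sample string is only an example matching the "Normal" read-buffer row above:

```python
from typing import NamedTuple


class TcpBuffer(NamedTuple):
    minimum: int   # index [0]
    default: int   # index [1]
    maximum: int   # index [2]


def parse_tcp_buffer(text: str) -> TcpBuffer:
    """Parse the contents of tcp_rmem or tcp_wmem (three integers, in bytes)."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 integers, got {len(fields)}")
    return TcpBuffer(*(int(f) for f in fields))


if __name__ == "__main__":
    # Example text only; on a live system read /proc/sys/net/ipv4/tcp_rmem.
    print(parse_tcp_buffer("4096\t87380\t174760"))
```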

tcp_mem
net.ipv4.tcp_mem[] (/proc/sys/net/ipv4/tcp_mem)
This kernel parameter is also an array of three integers that are used to control memory management behavior by defining the boundaries of memory management zones:
net.ipv4.tcp_mem[0] = pages below which TCP does not consider itself under memory pressure
net.ipv4.tcp_mem[1] = pages at which TCP enters memory pressure region
net.ipv4.tcp_mem[2] = pages at which TCP refuses further socket allocations (with some exceptions)
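
Unlike tcp_rmem and tcp_wmem, the tcp_mem thresholds are counted in pages, not bytes. A rough Python sketch for converting them; the 4096-byte default is an assumption (common on x86), and the page count in the example is a made-up value, not a kernel default:

```python
import os


def pages_to_bytes(pages: int, page_size: int = 4096) -> int:
    """Convert a tcp_mem threshold from pages to bytes."""
    return pages * page_size


if __name__ == "__main__":
    # Ask the running system for its real page size instead of assuming 4096.
    actual_page_size = os.sysconf("SC_PAGE_SIZE")
    example_pages = 196608  # hypothetical threshold, for illustration only
    print(pages_to_bytes(example_pages, actual_page_size))
```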

tcp_window_scaling
net.ipv4.tcp_window_scaling (/proc/sys/net/ipv4/tcp_window_scaling)
This variable enables TCP window scaling, which allows TCP window sizes larger than 64K
Note that socket buffers larger than 64K are still potentially beneficial even when window scaling is turned off

tcp_sack
net.ipv4.tcp_sack (/proc/sys/net/ipv4/tcp_sack)
This variable enables the TCP Selective Acknowledgments (SACK) feature
SACK is a TCP option that lets the receiver report exactly which segments arrived, so the sender retransmits only the missing ones

tcp_dsack
net.ipv4.tcp_dsack (/proc/sys/net/ipv4/tcp_dsack)
This variable enables the TCP D-SACK feature
D-SACK is an enhancement to SACK that detects unnecessary retransmits

tcp_fack
net.ipv4.tcp_fack (/proc/sys/net/ipv4/tcp_fack)
This variable enables the TCP Forward Acknowledgment (FACK) feature
FACK is a refinement of the SACK protocol to improve congestion control in TCP

TCP Connection Management

tcp_max_syn_backlog
net.ipv4.tcp_max_syn_backlog (/proc/sys/net/ipv4/tcp_max_syn_backlog)
This variable controls the length of the TCP Syn Queue for each port

tcp_synack_retries
net.ipv4.tcp_synack_retries (/proc/sys/net/ipv4/tcp_synack_retries)
This variable controls the number of times the kernel resends its SYN/ACK response to an incoming SYN segment
Reducing this number results in earlier detection of a failed connection attempt from the remote host

tcp_retries2
net.ipv4.tcp_retries2 (/proc/sys/net/ipv4/tcp_retries2)
This variable controls the number of times the kernel tries to resend data to a remote host with which it has an established connection
Reducing this number results in earlier detection of a failed connection to the remote host
This allows busy servers to quickly free up the resources tied to the failed connection, which makes it easier for the server to support a larger number of simultaneous connections

TCP Keep-Alive Management

tcp_keepalive_time
net.ipv4.tcp_keepalive_time (/proc/sys/net/ipv4/tcp_keepalive_time)
If a connection is idle for the number of seconds specified by this parameter, the kernel starts probing the connection to the remote host

tcp_keepalive_intvl
net.ipv4.tcp_keepalive_intvl (/proc/sys/net/ipv4/tcp_keepalive_intvl)
This parameter specifies the time interval in seconds between the keepalive probes sent by the kernel to the remote host

tcp_keepalive_probes
net.ipv4.tcp_keepalive_probes (/proc/sys/net/ipv4/tcp_keepalive_probes)
This parameter specifies the maximum number of keepalive probes the kernel sends to the remote host to detect if it is still alive

The default values are as follows:
tcp_keepalive_time = 7200 seconds (2 hours)
tcp_keepalive_probes = 9
tcp_keepalive_intvl = 75 seconds
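
Putting the three keepalive values together gives the time before an idle connection to a dead peer is dropped: the idle time plus one interval per unanswered probe. A quick Python sketch of that arithmetic (the function name is made up for illustration):

```python
def keepalive_detection_seconds(keepalive_time: int,
                                keepalive_probes: int,
                                keepalive_intvl: int) -> int:
    """Seconds from the last activity until a dead peer is given up on."""
    # Idle period before the first probe, then one interval per unanswered probe.
    return keepalive_time + keepalive_probes * keepalive_intvl


if __name__ == "__main__":
    total = keepalive_detection_seconds(7200, 9, 75)
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    print(total, "seconds =", hours, "h", minutes, "m", seconds, "s")
```

With the defaults this comes to 7875 seconds, a little over 2 hours 11 minutes, which is why busy servers often lower tcp_keepalive_time.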

IP Port Space Range

ip_local_port_range
net.ipv4.ip_local_port_range (/proc/sys/net/ipv4/ip_local_port_range)
This parameter specifies the range of ephemeral ports that are available to the system
Increasing this range allows a larger number of simultaneous connections for each protocol (TCP and UDP)
On systems with more than 128KB of memory, it is set to 32768 to 61000
so a maximum of 28,233 ports (both endpoints included) can be in use simultaneously
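
The port count comes straight from the two numbers in ip_local_port_range. A small Python sketch, assuming the range is inclusive at both ends: counting both endpoints gives 28,233, while the plain difference 61000 - 32768 is 28,232. The function names are hypothetical:

```python
def parse_port_range(text: str) -> tuple[int, int]:
    """Parse the two integers stored in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def ephemeral_port_count(low: int, high: int) -> int:
    """Number of ports in the inclusive range low..high."""
    if low > high:
        raise ValueError("low port must not exceed high port")
    return high - low + 1


if __name__ == "__main__":
    low, high = parse_port_range("32768\t61000")
    print(ephemeral_port_count(low, high))
```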

?? What is a port ??
A port is a logical abstraction that TCP and UDP use as an address of sorts to distinguish between individual sockets on the same host
it is simply a 16-bit integer (0 to 65535)

?? what can you tune with sysctl ??

TCP Socket and Buffer Sizes i.e. max socket connections
TCP Buffer Sizes i.e. max window sizes and network backlogs
TCP Memory Management i.e. tcp read/write buffers
TCP Connection Management i.e. keepalive and intervals
IP Port Space Range

--------------------------------------------------------------------
--------------------------------------------------------------------

What Is Interprocess Communication?

Interprocess communication allows processes to synchronize with each other and exchange data. In general, System V (SysV) IPC facilities provide three types of resources:

Semaphores. Allow processes to synchronize with each other and also prevent collisions when multiple processes are sharing resources.

Message queues. Asynchronously pass small data, such as messages, between processes.

Shared memory segments. Provide a fast way for processes to share relatively large amounts of data by sharing a common segment of memory among multiple processes.

In addition to these resources, IPC pipes and FIFOs are among the most commonly used IPC facilities in UNIX-based systems:

Pipes are unidirectional, first-in/first-out data channels that pass unstructured data streams between related processes.

FIFOs (a.k.a. named pipes) are pipes that have a persistent name associated with them.
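
To make the pipe description concrete, here is a minimal Python sketch that pushes bytes in one end of an anonymous pipe and reads them from the other. It runs in a single process for brevity (real use would have a parent and child on either end), and it assumes the payload is smaller than the kernel pipe buffer so the write does not block:

```python
import os


def pipe_roundtrip(payload: bytes) -> bytes:
    """Write payload into a pipe and read it back from the other end."""
    read_fd, write_fd = os.pipe()      # unidirectional: write_fd -> read_fd
    try:
        os.write(write_fd, payload)    # bytes come out in first-in/first-out order
    finally:
        os.close(write_fd)             # closing the write end signals EOF to the reader
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read()


if __name__ == "__main__":
    print(pipe_roundtrip(b"hello through the pipe"))
```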


ipcs -u //resources
ipcs -l //limits



--------------------------------------------------------------------
"Too many open files?" adjust the ulimit

the default open-file limit (ulimit -n) is typically 1024 on linux
ulimit -n 2048
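
The same limit can be inspected from inside a process. A rough Python sketch using the standard resource module (Unix only); the helper names are made up, and the clamp reflects the rule that an unprivileged process cannot raise its soft limit above the hard limit:

```python
import resource


def open_file_limits() -> tuple[int, int]:
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def clamp_soft_limit(target: int, hard: int) -> int:
    """Largest soft limit not exceeding the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return target
    return min(target, hard)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft:", soft, "hard:", hard)
    print("ulimit -n 2048 would give:", clamp_soft_limit(2048, hard))
```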

--------------------------------------------------------------------
IDE DISKS ONLY

hdparm /dev/hda
IO_support = 0 (default 16-bit)

put in rc.local
hdparm -c 1 /dev/hda
IO_support = 1 (32-bit)

DMA
hdparm -d 1 /dev/hda
--------------------------------------------------------------------


////////////////////////////////////////////////////////////////////
Other Misc questions

/17 network
formula? 2^N - 2 (where N is the number of bits added to the mask for subnetting)
nodes?
netmask?

2^n - 2 available subnets (n = borrowed subnet bits) and 2^n - 2 available hosts (n = remaining host bits)
http://www.pantz.org/networking/tcpip/subnetchart.shtml
CIDR - Classless Inter-Domain Routing i.e. /17
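
The /17 questions above can be worked with Python's standard ipaddress module. This is a sketch only: the 10.0.0.0 network is an arbitrary example, the function name is made up, and the 2^n - 2 host formula assumes a prefix of /30 or shorter:

```python
import ipaddress


def subnet_summary(cidr: str) -> dict:
    """Netmask, host bits and usable host count for an IPv4 CIDR block."""
    net = ipaddress.ip_network(cidr, strict=True)
    host_bits = net.max_prefixlen - net.prefixlen   # 32 - 17 = 15 for a /17
    return {
        "netmask": str(net.netmask),
        "host_bits": host_bits,
        # Subtract the network and broadcast addresses.
        "usable_hosts": 2 ** host_bits - 2,
    }


if __name__ == "__main__":
    print(subnet_summary("10.0.0.0/17"))
```

So a /17 has netmask 255.255.128.0 and 2^15 - 2 = 32766 usable host addresses.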

3-way TCP handshake: SYN -->, <-- SYN/ACK, ACK -->
three-way handshake
the originator sends a "SYN" to establish communication and "synchronize" the sequence numbers used to count the bytes of data which will be exchanged
the destination then sends a "SYN/ACK", which "synchronizes" its own byte count with the originator and acknowledges the initial packet
the originator then returns an "ACK", which acknowledges the packet the destination just sent

the connection is now "OPEN", and ongoing communication between the originator and the destination is permitted until one of them issues a "FIN" packet or a "RST" packet, or the connection times out
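
The sequence-number bookkeeping in the handshake can be sketched in Python. This only models the three segments as dictionaries; it opens no sockets, and the initial sequence numbers passed in are arbitrary example values (real stacks pick them unpredictably):

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the SYN, SYN/ACK and ACK segments of a TCP handshake."""
    syn = {"flags": "SYN", "seq": client_isn, "ack": None}
    # The SYN consumes one sequence number, so the server acknowledges client_isn + 1.
    syn_ack = {"flags": "SYN/ACK", "seq": server_isn,
               "ack": (client_isn + 1) % SEQ_MODULUS}
    # The client likewise acknowledges the server's SYN.
    ack = {"flags": "ACK", "seq": (client_isn + 1) % SEQ_MODULUS,
           "ack": (server_isn + 1) % SEQ_MODULUS}
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(1000, 5000):
        print(segment)
```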


faster, order 1-4 ?:
? context switch
? read from ram
? read from disk
? read from cpu register

2 to the 11th power = 2048

2^1 = 2
2^2 = 4
2^3 = 8
2^4 = 16
2^5 = 32
2^6 = 64
2^7 = 128
2^8 = 256
2^9 = 512
2^10 = 1024

How to reduce the sync or seek time for data on a hard disk?

////////////////

know these services and ports:

#ftpd
ftp 21 tcp
ftp-data 20 tcp //data connection
active mode = client connects from a random unprivileged port to port 21, then the server connects back from port 20 to the client's data port
passive mode = client initiates both connections to the server
http://slacksite.com/other/ftp.html

#tftpd-hpa
tftp 69 udp //trivial file transfer protocol

#sshd + scp
ssh 22 tcp
sftp 115/tcp //Simple File Transfer Protocol, not SSH's sftp (which runs over ssh, port 22)
sftp 115/udp


#in.telnetd typically run through inetd & inetd.conf
telnet 23 tcp

#bind9 aka named
dns 53 udp //queries
53 tcp //dns zone transfers

953 rndc control socket bind9

#dhcpd
67 & 68 = bootps (server) & bootpc (client)
67 = dhcp Dynamic Host Configuration Protocol (server side; clients use 68)
DHCP is based on BOOTP and maintains some backward compatibility
RARP is a protocol used by Sun and other vendors that allows a computer to find out its own IP address
DHCP, like BOOTP, runs over UDP, utilizing ports 67 and 68

#apache
http 80 tcp
https 443 tcp //ssl

#mail services...
pop3 110 tcp
pop3s 995 tcp //ssl

imap 143 tcp
imaps 993 tcp //ssl

smtp 25 tcp
smtps 465 tcp //ssl

#databases...
postgres 5432 tcp
mysql 3306 tcp

#nfs
http://nfs.sourceforge.net/nfs-howto/security.html
portmap aka sunrpc
111 udp
portmapper, rpc.statd, and rpc.lockd
mountd
statd, mountd, lockd, and rquotad
nfs 2049/tcp nfsd
nfs 2049/udp nfsd

#auth 113 tcp
host auth stuff
ircd 6667/tcp # Internet Relay Chat
ircd 6667/udp # Internet Relay Chat


#ntp 123 network time protocol
ntpdate

rsync 873/tcp # rsync
rsync -avz /data/ remotehost:/data/ //remotehost is a placeholder

syslog 514/udp
loghost

snmp 161/tcp # Simple Net Mgmt Proto
snmp 161/udp # Simple Net Mgmt Proto
snmptrap 162/udp snmp-trap # Traps for SNMP


x11 6000/tcp X # the X Window System

#microsoft
netbios-ns 137/tcp # NETBIOS Name Service
netbios-ns 137/udp
netbios-dgm 138/tcp # NETBIOS Datagram Service
netbios-dgm 138/udp
netbios-ssn 139/tcp # NETBIOS session service
netbios-ssn 139/udp
microsoft-ds 445/tcp # Microsoft-DS (SMB directly over TCP)
microsoft-ds 445/udp
ms-sql-s 1433/tcp # Microsoft-SQL-Server
ms-sql-s 1433/udp # Microsoft-SQL-Server
ms-sql-m 1434/tcp # Microsoft-SQL-Monitor
ms-sql-m 1434/udp # Microsoft-SQL-Monitor
wins 1512/tcp # Microsoft's Windows Internet Name Service
wins 1512/udp # Microsoft's Windows Internet Name Service
