Tuesday, November 20, 2012

Hot Remove a VMDK from LIVE LINUX Virtual Machine


If you want to remove an extra VMDK from a Linux VM, follow these steps.

First, unmount the filesystem on /dev/sdb1:

umount /dev/sdb1

Remove the /disk2 folder:

rmdir /disk2/

Remove the entry from the /etc/fstab:

nano or vi /etc/fstab

remove the following line:
/dev/sdb1               /disk2                  ext2    defaults        1 2

Delete the device (note that the sysfs path uses the whole disk, sdb, not the partition sdb1):

echo 1 > /sys/block/sdb/device/delete

Finally, remove the VMDK from the VM's settings (e.g. in the vSphere Client).
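The removal steps above can be gathered into one small sketch that prints the command sequence for review before you run it as root. The device (/dev/sdb1) and mount point (/disk2) are this post's examples, and the partition-to-disk derivation assumes sdX-style device names.

```shell
#!/bin/sh
# Sketch: print the hot-remove command sequence for a given data disk so
# it can be reviewed before pasting into a root shell.
print_removal_steps() {
    part=$1                      # partition, e.g. /dev/sdb1
    mnt=$2                       # mount point, e.g. /disk2
    disk=${part%%[0-9]*}         # whole disk, e.g. /dev/sdb
    echo "umount $part"
    echo "rmdir $mnt"
    printf '%s\n' "sed -i '\\|^$part|d' /etc/fstab"
    # the sysfs delete node lives under the DISK, not the partition
    echo "echo 1 > /sys/block/${disk#/dev/}/device/delete"
}

print_removal_steps /dev/sdb1 /disk2
```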

ATTACHING new SCSI HDD (VMDK) to a running Virtual Machine (VM) - Live VM

LIST THE SCSI HOSTS PRESENT ON THE SYSTEM; BASED ON WHICH HOST YOU ASSIGNED THE NEW DISK DRIVE TO, RUN THE FOLLOWING COMMANDS

ls /sys/class/scsi_host

RESCAN THE SCSI BUS

echo "- - -" > /sys/class/scsi_host/host0/scan

RUN FDISK TO VERIFY IN LINUX

fdisk -l

CHECK /var/log/messages TO VERIFY THE SCSI DEVICE WAS ATTACHED SUCCESSFULLY

tail -f /var/log/messages

CREATE A NEW PARTITION

fdisk /dev/sde

FORMAT THE NEW PARTITION

mkfs.ext3 /dev/sde1

CREATE A MOUNT POINT AND ADD AN ENTRY FOR IT TO /etc/fstab

mkdir <mountpoint>
vi /etc/fstab

MOUNT IT

mount -a
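The attach steps above can be sketched as one reviewable checklist. host0, /dev/sde and the mount point are assumptions taken from this post's shell history; substitute your own values (check fdisk -l and /var/log/messages first) before running anything.

```shell
#!/bin/sh
# Sketch: print the attach sequence as a checklist for one new disk.
attach_steps() {
    host=$1; dev=$2; mnt=$3
    cat <<EOF
echo "- - -" > /sys/class/scsi_host/$host/scan
fdisk -l
tail /var/log/messages
fdisk $dev
mkfs.ext3 ${dev}1
mkdir $mnt
mount ${dev}1 $mnt
EOF
}

attach_steps host0 /dev/sde /mnt/newdisk
```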

Monday, November 19, 2012

Thursday, November 15, 2012

A guide to figuring out the voltage range supported by any electronic equipment, by looking at the power cable

UN65ES6550F Can I Run The TVs Power Cable Through The Wall

Samsung's TV power cables are UL (Underwriters Labs) certified but are not CL (in-wall) rated. A CL rating on a cable (CL2 up to 150 volts, CL3 up to 300 volts) means the cable has a slow-burning outer jacket that should meet fire codes and is safe for in-wall installations. Since the power cord for the TV is not CL rated, it should never be installed inside a wall.


Wednesday, November 14, 2012

Rescan the SCSI Bus to Add a SCSI Device Without rebooting the VM


Rescan the SCSI Bus to Add a SCSI Device
A rescan can be issued by typing the following command:
echo "- - -" > /sys/class/scsi_host/host#/scan
fdisk -l
tail -f /var/log/messages


Format a New Disk

Create partition using fdisk and format it using mkfs.ext3 command:
# fdisk /dev/sdc
# mkfs.ext3 /dev/sdc3

Create a Mount Point And Update /etc/fstab

# mkdir /disk3
Open /etc/fstab file, enter:
# vi /etc/fstab
Append as follows:

/dev/sdc3               /disk3           ext3    defaults        1 2

Save and close the file.
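As a sketch, the fstab edit above can also be done idempotently from a script. FSTAB is a parameter I've added so the function can be rehearsed against a scratch copy before touching the real /etc/fstab.

```shell
#!/bin/sh
# Sketch: append an fstab entry (only if not already present), matching
# the /dev/sdc3 -> /disk3 example above.
FSTAB=${FSTAB:-/etc/fstab}

add_fstab_entry() {
    dev=$1; mnt=$2; fstype=$3
    # skip if an entry for this device already exists
    grep -q "^$dev[[:space:]]" "$FSTAB" ||
        printf '%s\t%s\t%s\tdefaults\t1 2\n' "$dev" "$mnt" "$fstype" >> "$FSTAB"
}
# usage (as root):
#   mkdir /disk3
#   add_fstab_entry /dev/sdc3 /disk3 ext3
#   mount -a
```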


Thursday, November 1, 2012

PowerCLI powershell VMWare Datastore autoload ps1 files profile powershell

Schedule a PowerCLI script as a cronjob or a scheduled task
http://www.virtu-al.net/2009/07/10/running-a-powercli-scheduled-task/


PS1 for Datastore management
  1. http://blogs.vmware.com/vsphere/2012/01/automating-datastore-storage-device-detachment-in-vsphere-5.html
  2. http://vmwaregirl.blogspot.com/2012/03/powercli-get-datastoremountinfo.html
    • List all the SCSI LUNs by LUN ID
    • get-vmhost -state connected | Get-ScsiLun | select-object CapacityMB,RuntimeName,CanonicalName | ft -autosize | findstr "L123"

  3. http://snipplr.com/view/48048
  • Create a new VMFS datastore from a LUN number (New-DatastoreByLun)

Auto-load a PowerShell script on startup.
  1. http://noahcoad.com/post/66/powershell-startup-auto-load-scripts
    • set-executionpolicy bypass
    • Place Profile.ps1 under %UserProfile%\Documents\WindowsPowerShell
    • Microsoft.PowerShell_profile.ps1 (for the basic shell)
    • Microsoft.PowerShellISE_profile.ps1 (for the GUI)

RDM REPORT - Datastore, RDM, LUN Visibility on Hosts (Node Visibility) output xls
  1. http://www.lucd.info/2010/04/09/lun-report-datastores-rdms-and-node-visibility/

Wednesday, October 24, 2012

Steps for detaching a snapshot VOLUME LUN and creating a new volume copy from the snapshot on an HP P2000 MSA SAN (VMware, PowerCLI, CLI)



Step 1. powerCLI (Only on the host where the VM is running)
    Shut down & power off the VM

 
Step 2. esxCLI (Perform on all hosts)
    a. If the LUN is an RDM, skip to step c. Otherwise, to get a list of all datastores mounted to an ESXi host, run the command:
 # esxcli storage filesystem list
    b. Unmount the datastore by running the command:
 # esxcli storage filesystem unmount [-u <UUID> | -l <label> | -p <path> ]
    c. Detach the device/LUN, run this command:
 # esxcli storage core device set --state=off -d NAA_ID
    d. To list the permanently detached devices: <Lists all detached LUN>
 # esxcli storage core device detached list
    e. Verify that the device is offline, run this command:
 # esxcli storage core device list -d NAA_ID
    f. Running the partedUtil getptbl command on the device shows that the device is not found.
 # partedUtil getptbl /vmfs/devices/disks/naa.?????????
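The per-host commands in Step 2 can be sketched as a small generator that prints the sequence for one device, ready to paste into each host's ESXi shell (these esxcli commands do not run on a plain Linux box). The datastore label and NAA ID below are placeholders; skip the unmount line for RDMs.

```shell
#!/bin/sh
# Sketch: emit the Step 2 detach sequence for one datastore/LUN.
detach_cmds() {
    label=$1; naa=$2
    echo "esxcli storage filesystem list"
    echo "esxcli storage filesystem unmount -l $label"   # skip for RDMs
    echo "esxcli storage core device set --state=off -d $naa"
    echo "esxcli storage core device detached list"
    echo "esxcli storage core device list -d $naa"
    echo "partedUtil getptbl /vmfs/devices/disks/$naa"   # should now report: not found
}

detach_cmds MyDatastore naa.placeholder0000000000000000
```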


Step 3. hpCLI
 # show volume-maps
 # unmap volume 10_23_LIS_Daily_C001



Step 4. esxCLI (Perform on all hosts)
  Rescan all devices on the ESXi host; run the command on all hosts:
    For rescanning vmhba2 only
 # esxcli storage core adapter rescan -A vmhba2
    For rescanning all adapters
 # esxcli storage core adapter rescan -a 


Step 5. hpCLI
 # show volumes
 # delete volumes 10_23_LIS_Daily_C001  (10 Seconds for 2.6TB partition)



Step 6. hpCLI
 # show volumes
 # show snapshots
 # show vdisks



Step 7. hpCLI
    To initiate volumecopy from the snapshot
 # volumecopy modified-snapshot no source-volume LIS_Daily dest-vdisk vd01 10_24_LIS_CLONE prompt yes
 Success: Command completed successfully. (LIS_Daily) - The volume copy started. (2012-10-24 16:18:53)

    Check volumecopy status
 # show volumecopy-status


Step 8. hpCLI
    Map volume 10_24_LIS_CLONE with
    read-write access for
    HOSTS ESXHS01_vmhba2 and ESXHS02_vmhba2
    Using ports A1,A2 and B1,B2 and
    LUN 101:
 # map volume access rw ports a1,a2,b1,b2 lun 101 host ESXHS01_vmhba2,ESXHS02_vmhba2 10_24_LIS_CLONE
  
    To apply the mapping to all hosts, omit the host parameter (by default it applies to all hosts):
 # map volume access rw ports a1,a2,b1,b2 lun 101 10_24_LIS_CLONE

Tuesday, October 23, 2012

Give me an object lesson than showing me your abusive power

A very memorable line learned from the TV series Alias.

Give me an object lesson than showing me your abusive power

Monday, October 15, 2012

Steps to solving the Rubik's Cube (home-made recipe)

Bottom 2 lines
3a: Ti Fi TF TR Ti Ri  (Right-side top-center to be moved to Front-Side right-center)

3b: TR Ti Ri Ti Fi TF  (Front-Side top-center to be moved to Right-Side left-center)

Top Center pieces
4 : F R U Ri Ui Fi   

5 : R T Ri T R T2 Ri

  Corners : A - B in place
6 : Ri F Ri B2 R Fi Ri B2 R2 Ui

  Center Top Clockwise
E      R
      F 
7a: F2 U Ri L F2 R Li U F2
   Anti - Clockwise
7b: F2 Ui Ri L F2 R Li Ui F2

Tuesday, October 2, 2012

Bring up interface ethx : Error: No suitable device found: no device found for connection 'System ethx'

Network devices failing to start after MAC address change in RHEL 6


  Remove the file /etc/udev/rules.d/70-persistent-net.rules and reload the network device's module.

  For example:
 
   # rm /etc/udev/rules.d/70-persistent-net.rules
   # rmmod e1000
   # modprobe e1000
   # service network restart


 Why and what is happening:
   In RHEL 6, when udev detects a new network device it runs /lib/udev/write_net_rules to generate /etc/udev/rules.d/70-persistent-net.rules. This file contains rules that persistently map a MAC address to a specific ethX name. If the MAC address changes, the file still reflects the old MAC, so udev is unable to give the new device the desired ethX name. By removing this file and loading the module again, udev will see a device that is not listed and will run the script again, generating appropriate rules.

Best Practice: VMWARE: Unpresenting a LUN on ESXi 5.0



Delete / Remove dead path http://raj2796.wordpress.com/2012/03/14/vmware-vsphere-5-dead-lun-and-pathing-issues-and-resultant-scsi-errors/


LIST of COMMANDS

~ # esxcli storage core device list|grep off -B12
~ # esxcli storage core device set --state=off -d naa.600c0ff00011d0284ca76c5001000000
~ # esxcli storage core device detached list
~ # partedUtil getptbl /vmfs/devices/disks/naa.600c0ff00011d028364bd54d01000000
~ # esxcli storage core adapter rescan -A vmhba1
~ # esxcli storage core adapter rescan -A vmhba2
~ # esxcli storage core adapter rescan --all


STEPS

  1. Getting the NAA ID of the LUN to be removed
    # esxcli storage vmfs extent list

  1. Unpresenting a LUN from vSphere Client
    • If the LUN is an RDM, skip to next step. Otherwise, in the Configuration tab of the ESXi host, click Storage. Right-click the datastore being removed, and click Unmount.

      Note: To unmount a datastore from multiple hosts, from the vSphere Client select Hosts and Clusters, Datastores and Datastore Clusters view (Ctrl+Shift+D)
    •  Choose the Devices View (Under Configuration > Storage > Devices tab):
      Right-click the NAA ID of the LUN (as noted above) and click Detach. A Confirm Device Unmount window is displayed. When the prerequisite criteria have been passed, click OK.  Perform individually on all hosts

In our case vCenter wasn't an option, since the hosts were unresponsive and vCenter couldn't communicate with them; the LUNs were also already detached, since they were never used. So:
list permanently detached devices:
# esxcli storage core device detached list
look at the output for LUNs with state off, e.g.:
Device UID                            State
------------------------------------  -----
naa.50060160c46036df50060160c46036df  off
naa.6006016094602800c8e3e1c5d3c8e011  off
next permanently remove the device configuration information from the system:
# esxcli storage core device detached remove -d
e.g.
# esxcli storage core device detached remove  -d naa.50060160c46036df50060160c46036df

OR

To detach a device/LUN, run this command:
# esxcli storage core device set --state=off -d NAA_ID
To verify that the device is offline, run this command:
# esxcli storage core device list -d NAA_ID


Monday, October 1, 2012

Understanding RHEL daemons (RedHat)

http://magazine.redhat.com/2007/03/09/understanding-your-red-hat-enterprise-linux-daemons/

acpid

This is the daemon for the Advanced Configuration and Power Interface (ACPI). ACPI is an open industry standard for system control related actions, most notably plug-and-play hardware recognition and power management, such as startup and shutdown and putting systems into low power consumption modes.

You'll probably never want to shut down this daemon, unless you are explicitly instructed to do so to debug a hardware problem.

Learn more:
http://www.acpi.info

anacron

One of the problems with living on a laptop, as so many of us do these days, is that when you set up a cron job to run, you can't always be sure that your laptop will be running at the time that the job should run. anacron (the name refers to its being an "anachronistic cron") gets around this problem by scheduling tasks in days. For example, anacron will run a job if the job has not been run in the specified number of days.

When are you safe not running anacron? When your system is running continuously. Should you simply stop cron from running if you have anacron running? No; anacron is able to specify job intervals in days, not hours and seconds.

Learn more:
http://anacron.sourceforge.net

apmd

This is the daemon for the Advanced Power Management (APM) BIOS driver. The APM hardware standard and apmd are being replaced by ACPI and acpid. If your hardware supports ACPI, then you don't need to run apmd.

atd

This is the daemon for the at job processor (at enables you to run tasks at specified times). You can turn off this daemon if you don't use it.

autofs

This daemon automatically mounts disks and file systems that you define in a configuration file. Using this daemon can be more convenient than explicitly mounting removable disks.

Learn more:
http://freshmeat.net/projects/autofs


Avahi-daemon and avahi-dnsconfd

The Avahi website defines Avahi as: 'a system which facilitates service discovery on a local network. This means that you can plug your laptop or computer into a network and instantly be able to view other people who you can chat with, find printers to print to, or find files being shared…' Avahi is a Zeroconf implementation. Zeroconf is an approach that enables users to create usable IP networks without having special configuration servers such as DNS servers.
A common use of the avahi-daemon is with Rhythmbox, so you can see music that is made available to be shared with others. If you're not sharing music or files on your system, you can turn off this daemon.

Learn more:
http://avahi.org
http://zeroconf.org



Sunday, September 30, 2012

Prevent an upstart / lightdm / gdm service from running at boot on Ubuntu


You can use an override. Note that "sudo echo manual >> /etc/init/lightdm.override" would fail for a non-root user, because the redirection is performed by your unprivileged shell, so pipe through tee instead:

    echo "manual" | sudo tee /etc/init/lightdm.override  

To start lightdm on command:

    sudo start lightdm  

To restore your system so that lightdm is always started on boot:

    sudo rm /etc/init/lightdm.override  

For more information, the upstart cookbook is your friend:

Overview of RPM commands

Source: http://www.idevelopment.info/data/Unix/Linux/LINUX_RPMCommands.shtml

Each entry below gives the purpose of a command, a short description, and example usage.
Install an RPM Package
RPM packages have file naming conventions like foo-2.0-4.i386.rpm, which include the package name (foo), version (2.0), release (4), and architecture (i386). Also notice that RPM understands FTP and HTTP protocols for installing and querying remote RPM files.
rpm -ivh foo-2.0-4.i386.rpm
rpm -i ftp://ftp.redhat.com/pub/redhat/RPMS/foo-1.0-1.i386.rpm
rpm -i http://oss.oracle.com/projects/firewire/dist/files/kernel-2.4.20-18.10.1.i686.rpm
Un-install an RPM Package
To un-install an RPM package, we use the package name foo, not the name of the original package file foo-2.0-4.i386.rpm above.
rpm -e foo
Upgrade an RPM Package
To upgrade an RPM package, RPM automatically un-installs the old version of the foo package and installs the new package. It is safe to always use rpm -Uvh to install and upgrade packages, since it works fine even when there are no previous versions of the package installed! Also notice that RPM understands FTP and HTTP protocols for upgrading from remote RPM files.
rpm -Uvh foo-1.0-2.i386.rpm
rpm -Uvh ftp://ftp.redhat.com/pub/redhat/RPMS/foo-1.0-1.i386.rpm
rpm -Uvh http://oss.oracle.com/projects/firewire/dist/files/kernel-2.4.20-18.10.1.i686.rpm
Query all Installed Packages
Use RPM to print the names of all installed packages installed on your Linux system.
rpm -qa
Query an RPM Package
Querying an RPM package will print the package name, version, and release number of the package foo only if it is installed. Use this command to verify that a package is or is not installed on your Linux system.
rpm -q foo
Display Package Information
RPM can display package information including the package name, version, and description of the installed program. Use this command to get detailed information about the installed package.
rpm -qi foo
List Files in Installed Package
The following command will list all of files in an installed RPM package. It works only when the package is already installed on your Linux system.
rpm -ql foo
Which package owns a file?
Use the following command to determine which installed package a particular file belongs to.
rpm -qf /usr/bin/mysql
For example:
    # rpm -qf /usr/bin/mysql
    mysql-3.23.52-3
List Files in RPM File
Use RPM to query a (possibly) un-installed RPM file with the "-p" option. You can use the "-p" option to operate on an RPM file without actually installing anything. This command lists all files in an RPM file you have in the current directory. Also note that RPM can query remote files through the FTP and HTTP protocols.
rpm -qpl kernel-2.4.20-18.10.1.i686.rpm
rpm -qpl ftp://ftp.redhat.com/pub/redhat/RPMS/foo-1.0-1.i386.rpm
rpm -qpl http://oss.oracle.com/projects/firewire/dist/files/kernel-2.4.20-18.10.1.i686.rpm
Verify an Installed Package
Use RPM to list all files that do NOT pass the verify tests (done on size, MD5 signature, etc).
rpm --verify mysql
Where a file does NOT pass, the output is listed using the following codes that signify what failed:
S File size
M Mode (includes permissions and file type)
5 MD5 sum
L Symlink
D Device
U User
G Group
T Mtime
Take for example the following:
# rpm --verify mysql
S.5....T c /etc/my.cnf
This example indicates that file /etc/my.cnf failed on:
File size, MD5 sum, and modified time (mtime).
However, the "c" tells us this is a configuration file so that explains the changes. It should still be looked at to determine what the changes were.
Check an RPM Signature Package
RPM can be used to check the PGP signature of specified packages to ensure their integrity and origin. Always use this command first before installing a new RPM package on your system. Also, GnuPG or PGP software must already be installed on your system before you can use this command.
rpm --checksig foo

How do you uninstall programs installed using make?



In case you were wondering how one would go about uninstalling something that was installed with ./configure -> make -> make install: Makefiles don't seem to have a 'remove' section, and make does not have a built-in 'sudo make remove <program>' feature.

Your options are:
  1. Search the application's install folder for a file that may have a comprehensive list of binaries and their locations
  2. Run the following command inside the source directory (this works only if the Makefile provides an uninstall target):
    • su to root and then run: make uninstall
    • or: sudo make uninstall
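A further technique (a sketch, not from the options above): if the project's Makefile honours DESTDIR, as most autotools projects do, you can stage the install into a scratch directory first to obtain a complete file manifest, then delete those paths later to uninstall.

```shell
#!/bin/sh
# Sketch: derive an uninstall manifest from a DESTDIR staging install.
list_staged() {
    # print the real target paths of the files installed under stage dir $1
    ( cd "$1" && find . -type f ) | sed 's|^\.||'
}
# usage:
#   make install DESTDIR=/tmp/stage
#   list_staged /tmp/stage > install.manifest
#   # uninstall later (as root):
#   #   while read f; do rm -f "$f"; done < install.manifest
```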

Saturday, September 29, 2012

wireshark change menu fonts xquartz osx

  1. Edit the pre-gtkrc and gtkrc files included in the Wireshark installation.
    • Open these files in your favorite text editor; they are located inside /Applications/Wireshark.app/Contents/Resources/themes/Clearlooks-Quicksilver-OSX/gtk-2.0
    • Search for the entry in each configuration file named "gtk-font-name"
    • Set it to whatever font you want. NOTE: OpenType fonts don't seem to display as well as TrueType fonts do.
    • For this example, I've changed the value to "Verdana 10" from the theme default
  2. Enable font smoothing
      • Go to /Applications/Wireshark.app/Contents/Resources/etc/fonts
      • Add the following text:
          <match target="font">
              <edit mode="assign" name="autohint">
              <bool>true</bool>
              </edit>
          </match>

  3. Restart (or start) Wireshark

Friday, September 28, 2012

Installing PUPPET on Red Hat, CentOS and Fedora

To get the latest releases of Puppet, you will need to add the EPEL repository.
 
 
(EPEL is a volunteer-based community from the Fedora project to create a repo of high-quality add-on packages for RHEL and clones)
Further details can be found at http://fedoraproject.org/wiki/EPEL/FAQ#hotouse
 
 
To Install the PUPPET MASTER on Redhat based system
 
1. Install the prerequisite libraries
 
      root@host# yum install ruby ruby-libs ruby-shadow
 
2a. Install Puppet Master (Server)
 
      root@host# yum install puppet puppet-server facter
 
 
2b. Install Puppet Agent (Client)
       root@host# yum install puppet facter
 
 

Wednesday, September 26, 2012

Bash Script Template for writing bash utilities

The following code can serve as a template for creating a bash utility. The purpose of this wrapper utility is to check a file out, edit it in an editor, and check the file back in.


The following is the command usage

root@host# ./rcwrapper [-c "comment goes here"] filename1 [filename2 filename3 ...]

 


#!/bin/bash

#############################
## Author: Nirav Doshi     ##
## Date: Sept 25, 2012     ##
## Purpose:                ##
##   Facilitate the checkout, edit and
##   checkin process for code viewing
##   and editing.
#############################

export EDITOR=vim

COMMENT="no comment supplied"    # default used when -c is not supplied

while [ $# -gt 0 ]; do           # until we run out of parameters
    case "$1" in
        -c)
            # Move to the next argument, which holds the comment
            shift
            # Set the comment value
            if [ -n "$1" ]; then # check that -c was passed with an argument
                COMMENT=$1
                echo "comment is \"$COMMENT\""
            else
                echo "$COMMENT"
            fi
            ;;
        *)
            # Check the file out of the repository
            echo "CO -c \"$COMMENT\" $1"

            # Call the editor program with the filename
            $EDITOR "$1"

            # Check the file back into the repository
            echo "CI -c \"$COMMENT\" $1"
            ;;
    esac
    shift
done




Monday, September 24, 2012

How can I install the packages from the EPEL software repository?


There are repository rpm packages for RHEL5 and RHEL6. The repository package installs the repo details on your local system for yum or up2date to use. Then you can install packages with your usual method, and the EPEL repository is included.
For EL5:
su -c 'rpm -Uvh http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm'
...
su -c 'yum install foo'
For EL6:
su -c 'rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-7.noarch.rpm'
...
su -c 'yum install foo'


http://fedoraproject.org/wiki/EPEL#How_can_I_use_these_extra_packages.3F

Thursday, September 13, 2012

Adding new Initscripts with Red Hat's chkconfig (init scripts) At Boot - autostart

Place the script in /etc/rc.d/init.d and run (as root)

chmod +x /etc/rc.d/init.d/oracle  

to make the script executable. If you are concerned about normal users seeing the script, you could try more restrictive file permissions, as long as the script is executable by root as a standalone script.

Notice the two comment lines in the script:

# chkconfig: 2345 80 05
# description: Oracle 8 Server

These lines are needed by chkconfig to determine how to establish the initial runlevels to add the service, as well as to set the priority for the start-and-stop script execution order. These lines denote that the script will start the Oracle 8 server for runlevels 2, 3, 4 and 5. In addition, the start priority will be set to 80 while the stop priority will be 05.

Now that the script is in place with the appropriate execute permissions and the required chkconfig comments are in place, we can add the initscript to the chkconfig configuration by typing, as root, chkconfig --add oracle.

Using chkconfig's query feature, we can verify our addition:

[root]# chkconfig --list | grep oracle
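For reference, a minimal skeleton of the kind of initscript chkconfig expects, carrying the two header comments described above. The service name "oracle" and the echo actions are placeholders for a real start/stop implementation.

```shell
#!/bin/sh
# chkconfig: 2345 80 05
# description: Oracle 8 Server

oracle_init() {
    case "$1" in
        start) echo "starting oracle" ;;   # e.g. su - oracle -c dbstart
        stop)  echo "stopping oracle" ;;   # e.g. su - oracle -c dbshut
        *)     echo "Usage: $0 {start|stop}"; return 1 ;;
    esac
}

oracle_init "${1:-start}"   # default to start so the sketch runs standalone
```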

 

Wednesday, August 8, 2012

NFS Server setup on LINUX and Mounting NFS on another server

http://tldp.org/HOWTO/NFS-HOWTO/server.html
http://www.cyberciti.biz/tips/ubuntu-linux-nfs-client-configuration-to-mount-nfs-share.html
http://nfs.sourceforge.net/

Troubleshoot at packet level using the following command
# tcpdump -i eth0 -vv -nn -X host 10.9.1.1
  1. Add entry to /etc/exports file (ON NFS SERVER01)


    # <NFSshare dir> <ip address to allow (optional parameters)>
    /archive 10.9.1.0/24(no_wdelay,rw,async)
    /archive/shareHL7 10.9.1.9(rw,sync) 10.9.1.11(rw,sync) 10.9.1.12(rw,sync) 10.9.1.0/24(ro,async) 10.9.1.8(rw,sync)


  2. Test if the share is active on the server

    Client01:# showmount -e server01

  3. On NFS Client01 (Open /etc/fstab file)

    # Server:/<dir>/ <mountpoint> <type> <optional parameters>
    NFSSERVER01:/archive/shareHL7 /archive nfs rw,hard,bg,tcp,nfsvers=3,rsize=32768,wsize=32768 0 0


  4. Allow hosts access TCP (/etc/hosts.allow)
    Add following line

    portmap: 10.9.1.8 , 10.9.1.9 , 10.9.1.11 , 10.9.1.12



  5. Re-read exports file
    You should run the command exportfs -ra to force nfsd to re-read the /etc/exports file


ISSUES: mount to NFS Server failed: RPC Error: Program/version mismatch (retrying).

Test your nfs version by typing
#man nfs
#yum --version nfs
#rpm -qa | grep -i nfs

List all the readily available versions
#rpcinfo -u localhost nfs

  • NFS Versions 2, 3, and 4 are supported on 2.6 and later kernels.
  • NFS over UDP and TCP on IPv4 are supported on the latest 2.4 and 2.6 kernels.


Monday, July 23, 2012

Rebuilding RPMDB (RPM Database) - open rpm file handles

Following the output below, there has been a process initiated by cron.daily since May 9th that never finished executing. The files in the SOSREPORT help troubleshoot the issue if you are troubleshooting remotely.

[root@BUTWIS01 ~]# ps -ef|grep rpm
root 1225 6420 0 Jul12 ? 00:00:00 /bin/sh /etc/cron.daily/rpm
root 1233 1225 0 Jul12 ? 00:00:00 /usr/lib/rpm/rpmq -q --all --qf %{name}-%{version}-%{release}.%{arch}.rpm\n
root 1353 6255 0 Jun05 ? 00:00:00 /bin/sh /etc/cron.daily/rpm
root 1359 1353 0 Jun05 ? 00:00:00 /usr/lib/rpm/rpmq -q --all --qf %{name}-%{version}-%{release}.%{arch}.rpm\n
root 1997 6451 0 May21 ? 00:00:00 /bin/sh /etc/cron.daily/rpm
root 2000 1997 0 May21 ? 00:00:00 /usr/lib/rpm/rpmq -q --all --qf %{name}-%{version}-%{release}.%{arch}.rpm\n
......



The problem appears to have begun on a certain date (let's say May 9th); that's the earliest log entry, and /var/log/rpmpkgs is a 0-byte file created on May 10. Unfortunately, we do not seem to have logs stretching back nearly that far on the server, so determining what happened may not be possible. Does the server itself have any 'messages' files in /var/log other than messages and messages.1?

One thing we can see is that each of the temporary files created by the cron job still has open file handles:

sort 1360 0 1 unknown /var/log/rpmpkgs.bvNeC1355 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)
sort 2001 0 1 unknown /var/log/rpmpkgs.AqCXY1999 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)
sort 2145 0 1 unknown /var/log/rpmpkgs.CMQBk2143 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)
sort 2580 0 1 unknown /var/log/rpmpkgs.unPLs2578 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)
sort 3485 0 1 unknown /var/log/rpmpkgs.LbxJe3483 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)
sort 4017 0 1 unknown /var/log/rpmpkgs.AiNLk4009 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)
sort 6523 0 1 unknown /var/log/rpmpkgs.BEVAC6518 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)
sort 6584 0 1 unknown /var/log/rpmpkgs.NfVjU6582 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)
sort 7071 0 1 unknown /var/log/rpmpkgs.blYnX7062 (lstat: Resource temporarily unavailable) (stat: Resource temporarily unavailable)

ls -hl /var/log

The fact that all the files still exist in /var/log/messages* suggests this isn't an issue with your storage or filesystem. Unfortunately, the logs still don't go back far enough to tell us what might have happened. The next thing I like to do is collect some data about what the stuck processes are doing. To that end, please run:

strace -Tttfvo /tmp/strace.out -p 15223

Let that run for 15 seconds or so, then press Ctrl+C and send us /tmp/strace.out.



The strace output shows that the rpm process is currently stalled in a futex wait:

15223 14:48:53.284527 futex(0x2ae242d4a6cc, FUTEX_WAIT, 1, NULL <unfinished ...>

This usually indicates a problem within the rpmdb, so perform the following steps to rebuild it.

1. Capture the current status:
# cd /var/lib/rpm
# /usr/lib/rpm/rpmdb_stat -CA > /tmp/rpmdb.out

2. Kill the rpm processes by running "killall -9 rpm"

3. Back up the rpmdb, then rebuild:
# mv /var/lib/rpm/__db.* /tmp
# rpm --rebuilddb

Running "rpm -qa" or "yum check-update" should confirm the rpmdb is back in working state. Read /tmp/rpmdb.out along with the results after you have run the above for more insight.

Sunday, July 22, 2012

How-To: Kill a Process Using the 'pidof' Command


If a process hangs and you want to easily kill it, type in a console:

kill -9 $(pidof process_name)

And replace process_name with a currently running process. For example, to kill rpm you would issue the following:

kill -9 $(pidof rpm)

Or awk, as another example:

kill -9 $(pidof awk)

Or, to target only a single matching process (pidof -s returns one PID):

kill -9 $(pidof -s awk)

pidof is a command that finds the process ID (PID) of a given application. What is inside the parentheses is replaced with the PID(s) pidof found, and the process with that PID is killed.
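As a sketch, the idiom above can be wrapped with a small safety check so SIGKILL is only sent when pidof actually finds the named process, instead of running "kill -9" with an empty argument list.

```shell
#!/bin/sh
# Sketch: kill-by-name, but bail out cleanly when no process matches.
kill_by_name() {
    pids=$(pidof "$1") || { echo "no process named $1" >&2; return 1; }
    kill -9 $pids    # word splitting is intentional: one PID per word
}
# usage: kill_by_name rpm
```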

Wednesday, July 11, 2012

Disk Drive - C.P.U - performance measurement on Linux server

(Covers tools like iostat, sar [CPU], mpstat [per-CPU usage], nfsiostat [NFS share stats], nfsstat [NFS stats], lpstat [CUPS printer stats] and vmstat [virtual memory stats].)

You need sysstat installed
#yum install sysstat




CURRENT SERVER STATS
=================================
[root@sssl-prime ~]# hdparm -T /dev/sda1
/dev/sda1:
Timing cached reads: 9152 MB in 2.00 seconds = 4581.28 MB/sec

[root@ssssl-prime ~]# hdparm -t /dev/sda1
/dev/sda1:
Timing buffered disk reads: 100 MB in 3.03 seconds = 33.04 MB/sec

[root@ssssl-prime ~]# hdparm -t /dev/sda2
/dev/sda2:
Timing buffered disk reads: 112 MB in 3.04 seconds = 36.85 MB/sec

[root@ssssl-prime ~]# hdparm -t /dev/sda
/dev/sda:
Timing buffered disk reads: 116 MB in 3.00 seconds = 38.62 MB/sec

[root@ssssl-prime ~]# hdparm -t /dev/sda3
/dev/sda3:
Timing buffered disk reads: 108 MB in 3.03 seconds = 35.64 MB/sec


[root@bu 5.8 ~]# hdparm -Tt /dev/sda

/dev/sda:
Timing cached reads: 26204 MB in 2.00 seconds = 13126.82 MB/sec
Timing buffered disk reads: 1312 MB in 3.00 seconds = 437.20 MB/sec

[root@bu 6.2 ~]# hdparm -Tt /dev/sda

/dev/sda:
Timing cached reads: 14174 MB in 2.00 seconds = 7095.22 MB/sec
Timing buffered disk reads: 912 MB in 3.00 seconds = 303.88 MB/sec

Thursday, June 28, 2012

Juju works like a charm

Complexity of having everything in the cloud... sysadmins have to deal with it.

Genericize the product so it's more sellable. Download Juju to get
abstraction and encapsulate things into a single nugget that an ops
guy can pre-provision configurations for, so people can use them in a
generic way; these are called charms. They were at 40 charms a year
back.

AppFlower is an open source company and its software was not easy to deploy.

Juju gives users deployment options of auto deploying charms.

Juju is a cloud abstraction layer but at a different layer than ????


It's developed in python.

juju.ubuntu.com
cloud.ubuntu.com
launchpad.net/juju

#juju on freenode

Clint Byrum. F.l@canonical.com
They are hiring.


He wrote a doc on what you need to set up on a distro (Ubuntu); Fedora
approached them too about a port.

Strong code review process almost scurry


Pretty good for ongoing management, and the plan is to make it a
really good management tool. It's like Fabric... Fabric allows you to
orchestrate things now, whereas with Juju you encode what needs to be
orchestrated now and make it available for later.

Juju is not for asking system what it's doing..

Fabric is more of a config mgmt tool at a lower level.

No transaction rollback for implementing the charms.

Charms have meta data what it talks to and how it talks to it.


Wednesday, June 27, 2012

Windows Shares access Error Windows 7 (Troubleshooting Steps)

Unable to access Samba shares from Windows 7 with error


Below is a brainstormed shopping list of troubleshooting suggestions.

Suggestions:

1. Basic IP

Though this is a basic step, it is worth mentioning. If the error occurs with the server name, try connecting with the IP address. If that works, it could be a DNS issue.

2. Check Firewall

3. Services


TCP/IP NetBIOS Helper service should be set to Automatic and Started.

Try starting the Computer Browser service, if it's not started.

Try stopping and disabling the Routing and Remote Access service, if it's started.


4. Network Card Binding Order


A. Check the binding order. Go to network connections, go to Advanced menu then select Advanced Settings…


B. Select the network connection you are using and move it to the top


C. Click OK and exit.


5. Enable 'Client for Microsoft Networks'

In network connections, go to the properties of network connection which you are using to connect to the server. Ensure that the 'Client for Microsoft Networks' is checked.



6. Enable NetBIOS over TCP/IP

A. Open the properties of the network connection, select Internet Protocol version 4 (TCP/IPv4) and click on Properties button.

B. On the new page, click on Advanced… button at the bottom.

C. Click on the WINS tab and, under NetBIOS setting, select Enable NetBIOS over TCP/IP and click OK to exit.


7. Select Authentication level

Check the below mentioned policy on Windows 7: Group policy editor:

Computer Configuration\Windows Settings\Security Settings\Local Policies\Security Options\ Network security: LAN Manager authentication level

Ensure that it is not set to refuse LM & NTLM authentication, or set to use NTLMv2 only. To be safe, you can select the following setting, which enables LM, NTLM and NTLMv2 authentication: Send LM & NTLM - use NTLMv2 session security if negotiated


Note: Ensure that this policy is not coming from Domain level group policy.

And, if you are using the Home or Home Premium edition and do not have the Group Policy editor, then make the change in the registry:


HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa

Value Name: LmCompatibilityLevel [DWORD]

Set the value to: 1

Reboot your system.
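For reference, the same change can be captured in a .reg file and imported via regedit - a sketch only; back up the registry before importing:

```reg
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa]
"LmCompatibilityLevel"=dword:00000001
```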



8. SMB Signing

Disable SMB signing and try:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanManServer\Parameters

Value Name: EnableSecuritySignature [DWORD]. Set the value to 1.

Value Name: RequireSecuritySignature [DWORD]. Set the value to 0.



9. Disable SMB 2.0

Disable SMB 2.0 on Windows 7 and try again. Disable SMB 2.0 at client end:
• Open the command prompt (cmd.exe) and type the following two commands, then reboot:

c:\>sc config lanmanworkstation depend= bowser/mrxsmb10/nsi
c:\>sc config mrxsmb20 start= disabled

Monday, June 25, 2012

VENUE (INDIE DESK - LOS ANGELES DOWNTOWN)

UC riverside (System Admin)

PAGER DUTY (ENTERPRISE LEVEL gateway for alerting phones/email/SMS)
http://www.pagerduty.com/
$18  per user

Nagios is feeding pagerduty...

Graphite and Collectd 

Jenkins (Business workflow automation and reporting)
Model business processes that get forgotten if run as a cron job
- Like if the batch job finished (Central dashboard where you can view)



CDN at Edgecast (Andrew Lientz VP managed service) - Santa Monica
http://www.centreon.com/ (Front end to MYSQL db created by nagios)
Cacti and centreon work well (and still do) for  the basic hardware features 
- Cacti works great for HDD failure reporting
Pinterest (DDoS tool in web browser) - 120 connections from a browser

TUNE TO THE APPLICATION
introduced the notion of a "GRID" [HTTP grid]
SNMP MIBs into the applications
initial color coding to deal with various features
add versioning and customer information to the status of the server


THE GRIDS
- HTTP
 - flash
 - windows
 - local load balancers
 - log processors

- Used by the 24/7 NOC to watch for application and server issues

Juniper 960 is half a rack in size (telco and CDN buy it)

THE NETWORK
- multiple data centers (do not have a backbone)
   ip transit, peering and dark fiber connectivity

- 24/7 Monitoring by clients using keynote, Gomez and Catchpoint

- Routers and switches with constant packet drops for one reason or another

ROUTING VIEW
- 1st-generation monitoring tool developed in-house (they leverage Cacti for that routing view)
 -- has color code system
 -- realtime graphing system

EXTERNAL VIEW
 - monitoring and pinging outside servers and the routes to those servers - what if a route goes down?


TOO MUCH DATA
- better way to dashboard
EVEN WITH
 - 24/7 monitoring
 - centreon for SNMP traps and hardware failures
 - routing views
 - third-party monitoring


REDUCE NOISE
 - Focus on warnings and alerts
 - Take what we have learned from the grid and put in alarms for each
 

NEW 2nd GEN SERVERS VIEW


THE CUSTOMER
- Refine the tools for the customer
 -- NOC
  - Content owners
 -- Engineering
 -- DevOps (Software rollout)
 -- Capacity Planning (massive capacity issue - where to build the next DC)



DASHBOARD - is it built from scratch or from an open-source project?


THE END USER
- watching our network isn't enough
 - We need to develop QoS tools
 - Look at all the networks not just the ones we directly connect to
 - Leverage beacons (google analytics - end user measurement) and content provider relationships to give proper end to end measurements


A bad time-to-first-byte is a DNS issue.
A bad last-byte time is a route issue.


Lance Lakey  lancelakey@gmail.com
Hack night  in Hollywood

@lancelakey on GitHub and twitter

Matthew King (Software Engineer, TX)

Redis

is an open source, in-memory key-value store


Redis:
monitor
inspect

Monday, June 18, 2012

SAN Storage - naming convention (SAN disciplines - naming conventions)

http://www.redbooks.ibm.com/abstracts/tips0031.html?Open


Contents

Here are some important factors for using and developing naming conventions in a SAN:

Naming conventions
Use of descriptive naming conventions is one of the most important factors in
a successful SAN. Good naming standards will improve problem diagnostics,
reduce human error, allow for the creation of detailed documentation and
reduce the dependency on individuals.

Servers
Typically, servers will already have some form of naming standard in place.
The local server name is typically used as the host name defined to the disk
system. For the ESS you would normally use the server name in the server
description field. The same local server name can be used within the switch
fabric for zone settings, and whenever possible the use of the server name
should be consistent throughout the SAN.

Cabinets
SAN fabric cabinets should be labeled to adhere to local site standards.

SAN fabric components
A good naming convention for a SAN fabric component should tell you the
physical location and the component type, have a unique identifier, and
describe what the component connects to. The following are some descriptor
fields that may be considered when designing a fabric naming convention. If
your SAN only has one vendor type or only one cabinet, the name could be a
lot simpler.

Component description
This should describe the fabric component and the product vendor (for mixed
vendor environments) which will help you locate the management interface
and the component number within the SAN. For example, to give it a unique
identifier you may want to use something similar to the following:

  • Type — Switch (S) Director (D) Gateway (G) Hub (H) Router (R)
  • Vendor — Brocade (B) INRANGE (I) McDATA (M) Vicom (V)
  • Number — 1 - 99

For example, the third Brocade Switch in cabinet one would be:
  • S3 B

Connection description
This should detail what the component is connecting to. For highly available
devices such as the ESS, it is important to understand which cluster side of
the device the component is connected to. This will help prevent potential
mistakes in the SAN design. For devices used to expand the SAN that do not
connect to disk or tape, we will simply identify them as cascade.
  • Connection — Disk (D (for ESS either cluster A or B)), Tape (T), Cascade (C)
  • Number — 1 - 99

To continue our example, the third Brocade Switch in cabinet one connecting
to ESS3 Cluster A would be:
  • S3 B D3A

Physical location
This may be the cabinet descriptor field and, for example, SAN cabinet one
could be C1. For our example this would give us:
  • S3 B D3A C1

We show how our name is developed in the figure below.

Saturday, June 16, 2012

Linux Performance [High Load Alert] Troubleshooting issues on servers (TOOLS to use)

What do you do when you get an alert that your system load is high? Tracking down the cause of high load just takes some time, some experience and a few Linux tools.

CPU LOAD-
  1. uptime (shows the 1-, 5- and 15-minute load averages)
  2. top
    Cpu(s): 11.4%us, 29.6%sy, 0.0%ni, 58.3%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
    • us: user CPU time. More often than not, when you have CPU-bound load, it's due to a process run by a user on the system, such as Apache, MySQL or maybe a shell script. If this percentage is high, a user process such as those is a likely cause of the load.

    • sy: system CPU time. The system CPU time is the percentage of the CPU tied up by kernel and other system processes. CPU-bound load should manifest either as a high percentage of user or high system CPU time.

    • id: CPU idle time. This is the percentage of the time that the CPU spends idle. The higher the number here the better! In fact, if you see really high CPU idle time, it's a good indication that any high load is not CPU-bound.

  • wa: I/O wait. The I/O wait value tells the percentage of time the CPU is spending waiting on I/O (typically disk I/O). If you have high load and this value is high, it's likely the load is not CPU-bound but is due to either RAM issues or high disk I/O.
A little below that, top tells you which processes are hogging the CPU. By default top sorts by CPU usage, so the heaviest consumers appear at the top.
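If you want the same sorted-by-CPU view as a one-shot snapshot instead of the interactive top screen, ps can produce it (a minimal sketch using standard procps options):

```shell
# List the top 10 CPU-consuming processes, heaviest first.
ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 11
```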


**NOTE**: There are instances where an application spawning multiple threads on a single-CPU server causes the server to have a lot of wait cycles and a high load average.

  1. iostat
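iostat comes from the sysstat package, so it may need installing first; a typical invocation for spotting disk-bound load might look like this (a sketch - watch the %util and await columns):

```shell
# Extended per-device stats every 2 seconds, 3 samples.
if command -v iostat >/dev/null 2>&1; then
    iostat -dx 2 3
else
    echo "iostat not found - install the sysstat package"
fi
```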



Memory LOAD
Check SWAP memory usage

Once all the memory is used up, the swap space is used. Swap usually lives on a hard drive and is much slower than RAM, causing processes that load from swap to slow down dramatically. This becomes a downward spiral, causing more wait for other processes and slowing the system to a crawl. It is easy to misdiagnose swap issues as high disk I/O.

After all, if your disk is being used as RAM, any processes that actually want to access files on the disk are going to have to wait in line. So, if I see high I/O wait in the CPU row in top, I check RAM next and rule it out before I troubleshoot any other I/O issues.

    Mem:  1024176k total, 997408k used, 26768k free, 85520k buffers
    Swap: 1004052k total, 4360k used, 999692k free, 286040k cached

This tells us how much swap memory is used and how much is free.
  1. more to come....
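Outside of top, the same swap numbers come from free and /proc/meminfo; a quick sketch:

```shell
# Summarize RAM and swap in megabytes (free is part of procps).
command -v free >/dev/null 2>&1 && free -m

# The raw counters behind it, straight from the kernel:
grep -E '^(SwapTotal|SwapFree)' /proc/meminfo
```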
HIGH DISK I/O Bound LOAD
  1. more to come.....


http://www.linuxjournal.com/magazine/hack-and-linux-troubleshooting-part-i-high-load

Thursday, June 14, 2012

[puppet] Installation on Red Hat Enterprise Linux RHEL5 RHEL6


Ran into a bit of a problem initially installing 'puppet', the configuration management tool, on Red Hat. The installation was pretty straightforward, but if you are not using Red Hat on a daily basis you may get thrown off.

I was aware enough to at least have EPEL installed on the system, as indicated in the EPEL article by the Red Hat folks.

Install epel-release rpm package according to your RHEL version as shown below

# rpm -Uvh http://mirrors.xmission.com/fedora/epel/<RHEL version>/<arch>/epel-release-<version>.noarch.rpm  

Links to these rpm packages can be found at http://fedoraproject.org/wiki/EPEL#How_can_I_use_these_extra_packages.3F

Comment

There are many packages included with Fedora that are not included in Red Hat Enterprise Linux. In effort to make certain, high quality packages from Fedora available for Red Hat Enterprise Linux, the Fedora community has created the Extra Packages for Enterprise Linux (EPEL) program.

The EPEL program is a volunteer-run community program. New packages are suggested and added to the program by volunteers.

Packages in the EPEL program are not supported by Red Hat.
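With EPEL in place, the rest of the install is plain yum. A sketch of the steps, wrapped in a function so nothing runs until you call it - the package and SysV service names assume the EPEL puppet packages on RHEL5/6:

```shell
# Install the puppet agent from EPEL and enable it at boot (run as root).
install_puppet_agent() {
    yum install -y puppet       # pulls ruby and friends as dependencies
    chkconfig puppet on         # SysV init on RHEL5/6
    service puppet start
}
```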


Import-Csv ---> ForEach-Object ---> Get-ADUser ---> Set-ADUser




Pay attention to how the variable is used inside the -Filter parameter, which is inside the foreach. Review the article linked below to understand it better: http://www.sapien.com/forums/scriptinganswers/forum_posts.asp?TID=4074


C:\>Import-Csv .\import.csv | ForEach-Object {
Get-ADUser -Filter "EmailAddress -like '$($_.email)'" | Set-ADUser -OfficePhone $_.phone
}

The whole command above can be on a single line.

A sample import.csv file:
email, phone
xyz@abc.com, 818-549-12380 x4755


Thursday, April 12, 2012

How to avoid / Kill Zombies (Linux / Unix)

First of all, if you think you can kill zombies, it's ironic: zombies are already dead. (Referring to processes, of course.)

With the above perspective in mind, let's investigate the issue. To find out if you have zombie processes running on your *nix box, run the following command:

# ps -elf|grep Z

This will display a list of processes containing a capital 'Z' on your terminal/console. Let me explain the 'Z': the 'ps' output includes a STAT field showing the current status of each process, and the character 'Z' marks a zombie process.


ZOMBIE WAR
I don't care - I want the zombies on my system vaporized and gone forever!
In that case, your best bet is to find the PPID (the parent process ID) of the zombie process and, if possible, terminate that parent process. This will result in the termination of the zombie process. The other option, on a non-critical system, is to reboot during a maintenance window.
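A sketch of hunting down the parents: list each zombie together with its PPID so you know which parent process to deal with:

```shell
# Print every zombie process along with its parent's PID.
# STAT beginning with 'Z' marks a defunct (zombie) process.
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/ {print "zombie " $1 " (parent " $2 ")"}'
```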

Are these things harmful for my system? Will they eat my system resources?
The best way to answer this yourself is to think about the zombie lifestyle. Zombies don't eat or drink; they simply exist and are grossly annoying. Similarly, a zombie process on your system is most likely just taking up space in your 'ps' and 'top' output. A zombie process is light on the system and does not require many resources. An orphan process is different: an orphan is a fully functional executing process whose parent has abandoned it, so the grandparent, the 'init' process, now has custody of it.

Causes of Zombie process creation?
It was discovered that a script being called with "nohup" was causing zombie processes on the system. The process could have been called using "nohup" followed by the shell script name or a shell command. Since the parent process does not wait for the child process to finish (the parent has abandoned the child), on several occasions this phenomenon leads to the creation of zombie processes.

How to avoid this from happening?
To avoid this from happening, best practice suggests the use of "wait" in the script.
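A minimal sketch of that pattern: launch the background job with nohup, record its PID, and wait on it so the parent reaps the child's exit status instead of leaving a zombie:

```shell
#!/bin/sh
# Background a long-running command, immune to hangups.
nohup sleep 2 >/dev/null 2>&1 &
child=$!

# wait blocks until the child exits and reaps it, preventing a zombie.
wait "$child"
echo "child $child exited with status $?"
```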

MORE INFORMATION:
"nohup Keeps a command running even after user logs off. The command will run as a foreground process unless followed by &. If you use nohup within a script, consider coupling it with a wait to avoid creating an orphan or zombie process."


The above extract is from the 'wait' documentation:

wait info
NAME
wait - await process completion

SYNOPSIS
wait [pid...]

http://pubs.opengroup.org/onlinepubs/7908799/xcu/wait.html