Tuesday, May 14, 2013

Linux and Solaris Day to Day Technical Issues

Issue01:
Solaris 10
/var/adm/messages
 sendmail[24365]: [ID 702911 mail.crit] My unqualified host name (solaris01) unknown; sleeping for retry
 sendmail[24378]: [ID 702911 mail.crit] My unqualified host name (solaris01) unknown; sleeping for retry
 sendmail[24397]: [ID 702911 mail.crit] My unqualified host name (solaris01) unknown; sleeping for retry
 sendmail[24490]: [ID 702911 mail.crit] My unqualified host name (solaris01) unknown; sleeping for retry
 sendmail[24365]: [ID 702911 mail.alert] unable to qualify my own domain name (solaris01) -- using short name
 sendmail[24378]: [ID 702911 mail.alert] unable to qualify my own domain name (solaris01) -- using short name
 sendmail[24397]: [ID 702911 mail.alert] unable to qualify my own domain name (solaris01) -- using short name

/etc/hosts:
57.28.245.105   solaris01        segment1        loghost

Resolution
A: The DNS server configured on this host had lost network connectivity, so sendmail could not qualify the host name.
A: Ref: http://forums.fedoraforum.org/archive/index.php/t-85365.html
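
The /etc/hosts entry above lists only short names, so when DNS is unreachable sendmail likely has no local source for a fully qualified name. A minimal sketch of a check for that condition follows. The function name has_fqdn is hypothetical, and the sketch assumes a POSIX shell with a POSIX awk first in PATH (on Solaris 10 the legacy /usr/bin/awk may not accept -v).

```shell
# has_fqdn HOSTS_FILE SHORT_NAME   (hypothetical helper, not a system tool)
# Exit 0 if a non-comment line in HOSTS_FILE that mentions SHORT_NAME
# also carries at least one dotted (fully qualified) name, else exit 1.
has_fqdn() {
    awk -v host="$2" '
        { sub(/#.*/, "") }          # strip trailing comments
        NF < 2 { next }             # skip blank or address-only lines
        {
            found = 0; dotted = 0
            for (i = 2; i <= NF; i++) {      # field 1 is the IP address
                if ($i == host) found = 1
                if (index($i, ".") > 0) dotted = 1
            }
            if (found && dotted) ok = 1
        }
        END { exit (ok ? 0 : 1) }
    ' "$1"
}

# Example (commented out so that sourcing this file has no side effects):
# has_fqdn /etc/hosts solaris01 || echo "no FQDN listed for solaris01"
```

Run against the entry shown above, the check fails because none of solaris01, segment1 or loghost contains a dot.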

Possible questions for this issue:
Q1: Which other servers use sendmail?
Q2: How is the /etc/hosts file configured on those servers?
Q3: Why did this start on 08 May 2013 and not earlier?
Q4: What is the Solaris equivalent of the Linux stat command?
Q5: What is the DNS server entry for this hostname?

Issue02:
RHEL 6.2
dmesg
Performance Events: PEBS fmt1+, Nehalem events, Broken BIOS detected,
complain to your hardware vendor.
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)

Resolution:
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c03265132&lang=en&cc=us&taskId=101&prodSeriesId=4268686

Issue03:
RHEL 5.4
/var/log/messages:

kernel: Buffer I/O error on device dm-13, logical block 3989103
kernel: lost page write due to I/O error on dm-13
kernel: Aborting journal on device dm-13.
kernel: Buffer I/O error on device dm-13, logical block 1545
kernel: lost page write due to I/O error on dm-13
kernel: __journal_remove_journal_head: freeing b_committed_data
kernel: ext3_abort called.
kernel: EXT3-fs error (device dm-13): ext3_journal_start_sb: Detected aborted journal
kernel: Remounting filesystem read-only


Commands:
# systool -c fc_host -v
# systool -c fc_transport -v
# dmsetup table
# dmsetup ls --tree
# cat /proc/mdstat
# pvs
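
The dm-13 in the kernel messages is a device-mapper minor number, not a name, so it has to be matched against the dmsetup output. As a rough sketch, the hypothetical helper below summarises which devices logged buffer I/O errors. It reads log lines on standard input and assumes the message format shown in the log excerpt above.

```shell
# io_error_devices   (hypothetical helper)
# Read kernel log lines on stdin and print "<device> <count>" for every
# device that reported a "Buffer I/O error".
io_error_devices() {
    sed -n 's/.*Buffer I\/O error on device \([^, ]*\),.*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Example (commented out so that sourcing this file has no side effects):
# io_error_devices < /var/log/messages
```

Fed the excerpt above, this prints "dm-13 2". The Min column of "dmsetup info -c" then shows which mapped device has minor number 13.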

Error:

When launching the SANsurfer FC HBA CLI, the following warning messages may appear on the console:
qla2xxx 0000:01:02.0: Unable to read SFP data (102/a0/0).
qla2xxx 0000:01:02.0: Unable to read SFP data (102/a0/0).
qla2xxx 0000:01:02.1: Unable to read SFP data (102/a0/0).
qla2xxx 0000:01:02.1: Unable to read SFP data (102/a0/0).
qla2xxx 0000:01:02.0: Unable to read SFP data (102/a0/0).
qla2xxx 0000:01:02.1: Unable to read SFP data (102/a0/0).
qla2xxx 0000:01:02.0: Unable to read SFP data (102/a0/0).
The driver displays these messages when it is unable to read SFP data. You can safely ignore these messages.


There are three possibilities:
1.    Connectivity issue from the server to the storage
2.    Hard disk issue: since this is just a LUN sliced from a group of disks and the storage side logs no errors, this is unlikely to be the cause.
3.    Improper Multipath Configuration (Procedure: https://access.redhat.com/site/solutions/47894)
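
For the first possibility, the state of each FC port can be read from sysfs, which is the same data that "systool -c fc_host -v" prints. The sketch below lists ports that are not Online. offline_fc_ports is a hypothetical helper, and the optional directory argument is an assumption added only so the function can be tried against a copy of the tree.

```shell
# offline_fc_ports [SYSFS_DIR]   (hypothetical helper)
# Print "<host> <state>" for every FC host whose port_state is not "Online".
# SYSFS_DIR defaults to /sys/class/fc_host.
offline_fc_ports() {
    dir="${1:-/sys/class/fc_host}"
    for f in "$dir"/host*/port_state; do
        [ -r "$f" ] || continue          # no FC hosts, or unreadable entry
        state=$(cat "$f")
        if [ "$state" != "Online" ]; then
            host=$(basename "$(dirname "$f")")
            echo "$host $state"
        fi
    done
}

# Example (commented out so that sourcing this file has no side effects):
# offline_fc_ports
```

No output means every port reports Online, which would point away from a link-level problem and toward the multipath configuration.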


Resolution: https://access.redhat.com/site/solutions/35122

Issue04:

Error on dmesg:
Possible causes:
1.    The RAID controller might be faulty, but nothing is logged by "hplog -v", "hpacucli ctrl all show config" shows no errors, and "cat /proc/driver/cciss/cciss0" looks fine.
2.    File system error. To rebuild the journal:
a.    Unmount the filesystem.
b.    Remove the journal.
c.    Recreate the journal.
d.    Remount the filesystem.
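
The four steps above can be sketched as follows, assuming an ext3 filesystem managed with tune2fs. The function names, the DRY_RUN switch, and the e2fsck step between removing and recreating the journal are additions for illustration and are not part of the original notes. By default the commands are only printed.

```shell
# run_cmd: print the command when DRY_RUN is 1 (the default), else run it.
run_cmd() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# rebuild_journal DEVICE MOUNTPOINT   (hypothetical helper, ext3 assumed)
rebuild_journal() {
    dev="$1"; mnt="$2"
    run_cmd umount "$mnt" || return 1                       # a. unmount
    run_cmd tune2fs -O '^has_journal' "$dev" || return 1    # b. remove journal
    # Added step (assumption): check the filesystem before recreating the
    # journal. e2fsck exit codes 0 and 1 mean clean or errors corrected.
    rc=0
    run_cmd e2fsck -f "$dev" || rc=$?
    [ "$rc" -le 1 ] || return "$rc"
    run_cmd tune2fs -j "$dev" || return 1                   # c. recreate journal
    run_cmd mount "$dev" "$mnt"                             # d. remount
}

# Example (commented out; device and mount point are placeholders):
# rebuild_journal /dev/mapper/vg-lv /data
```

Setting DRY_RUN=0 executes the commands for real, so review the printed sequence against the actual device first.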