Discussion:
[RFC][PATCH 2/7] Common code for the physmem map
Dave Hansen
2005-03-24 07:53:18 UTC
Post by Hariprasad Nellitheertha
The topic of creating a common interface across
architectures for obtaining system RAM information has been
discussed on lkml and fastboot for a while now.
Sorry, I missed this on LKML.
Post by Hariprasad Nellitheertha
Kexec needs
information about the entire physical RAM present in the
system while kdump needs information on the memory that the
kernel has booted with.
I think there's likely a lot of commonality with the needs of memory
hotplug systems here. We effectively dump out the physical layout of
the system, but in sysfs. We do this mostly because any memory hotplug
changes generate hotplug events, just like all other hardware. If you
do this in /proc, it's another thing that memory hotplug will have to
update.

Also, we already have a concept of active and non-active physical
memory: we call it online and offline. Some tweaks to the information
that we export might be all that you need, instead of creating a new
interface. I've attached a document I started writing a couple days ago
about the sysfs layout and the call paths for hotplug. It's horribly
incomplete, but not a bad start.
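
As a hedged illustration of the online/offline distinction, here is a minimal
user-space sketch that walks memoryN/state attributes under
/sys/devices/system/memory. The directory and attribute names, and the
assumption that sections are numbered contiguously from 0, follow the layout
being discussed here, so treat them as assumptions rather than a settled
interface.

#include <stdio.h>

int main(void)
{
        char path[128], state[16];
        int i;

        for (i = 0; ; i++) {
                FILE *f;

                /* assumed layout: /sys/devices/system/memory/memoryN/state */
                snprintf(path, sizeof(path),
                         "/sys/devices/system/memory/memory%d/state", i);
                f = fopen(path, "r");
                if (!f)
                        break;          /* no more sections */
                if (fgets(state, sizeof(state), f))
                        printf("memory%d: %s", i, state); /* "online" or "offline" */
                fclose(f);
        }
        return 0;
}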

If you want to see some more details of the layout, please check out
this patch set:

http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/patch-2.6.12-rc1-mhp1.gz

A good example of all of the hotplug stuff enabled for a normal machine
is this .config; it boots on my 4-way PIII Xeon.

http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/configs/config-i386-sparse-hotplug

You're welcome to borrow the machine that I normally boot this config
on. Should make booting it relatively foolproof. :)

-- Dave
Hariprasad Nellitheertha
2005-03-24 10:25:08 UTC
...
Post by Dave Hansen
I think there's likely a lot of commonality with the needs of memory
hotplug systems here. We effectively dump out the physical layout of
the system, but in sysfs. We do this mostly because any memory hotplug
changes generate hotplug events, just like all other hardware. If you
do this in /proc, it's another thing that memory hotplug will have to
update.
We put it in /proc primarily because what we wanted was
similar in many ways to /proc/iomem and so we (re)use a bit
of the code. Also, we were wondering if it is appropriate to
put multiple values in a single file in sysfs.
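
For context, /proc/iomem packs several values into each line, which is the
format being mirrored here. A rough user-space sketch of parsing it (this is
just the existing /proc/iomem format, nothing specific to the patch):

#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/proc/iomem", "r");
        char line[256], name[128];
        unsigned long long start, end;

        if (!f)
                return 1;
        while (fgets(line, sizeof(line), f)) {
                /* lines look like "00100000-0fffffff : System RAM" */
                if (sscanf(line, "%llx-%llx : %127[^\n]",
                           &start, &end, name) == 3)
                        printf("%#llx-%#llx %s\n", start, end, name);
        }
        fclose(f);
        return 0;
}
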
Post by Dave Hansen
Also, we already have a concept of active and non-active physical
memory: we call it online and offline. Some tweaks to the information
that we export might be all that you need, instead of creating a new
interface.
Looks like. And the tweaks could be handled by the user
space kexec-tools.

Post by Dave Hansen
I've attached a document I started writing a couple days ago
about the sysfs layout and the call paths for hotplug. It's horribly
incomplete, but not a bad start.
If you want to see some more details of the layout, please check out
http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/patch-2.6.12-rc1-mhp1.gz
This does not have the sysfs related code. Is there a
separate patch for adding the sysfs entries?
Post by Dave Hansen
A good example of all of the hotplug stuff enabled for a normal machine
is this .config; it boots on my 4-way PIII Xeon.
http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/configs/config-i386-sparse-hotplug
You're welcome to borrow the machine that I normally boot this config
on. Should make booting it relatively foolproof. :)
-- Dave
------------------------------------------------------------------------
block_size_bytes: The size of each memory section (in hex)
This value is per memoryXXXX directory, right?

Regards, Hari
Dave Hansen
2005-03-24 15:48:49 UTC
Post by Hariprasad Nellitheertha
Post by Dave Hansen
I think there's likely a lot of commonality with the needs of memory
hotplug systems here. We effectively dump out the physical layout of
the system, but in sysfs. We do this mostly because any memory hotplug
changes generate hotplug events, just like all other hardware. If you
do this in /proc, it's another thing that memory hotplug will have to
update.
We put it in /proc primarily because what we wanted was
similar in many ways to /proc/iomem and so we (re)use a bit
of the code.
The code reuse is nice, but the expanded use of /proc is not.
Post by Hariprasad Nellitheertha
Also, we were wondering if it is appropriate to
put multiple values in a single file in sysfs.
Why would you need to do that?
Post by Hariprasad Nellitheertha
Post by Dave Hansen
I've attached a document I started writing a couple days ago
about the sysfs layout and the call paths for hotplug. It's horribly
incomplete, but not a bad start.
If you want to see some more details of the layout, please check out
http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/patch-2.6.12-rc1-mhp1.gz
This does not have the sysfs related code. Is there a
separate patch for adding the sysfs entries?
Hmmm. I think my rollup script broke. Try this:

http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/broken-out/L0-sysfs-memory-class.patch
Post by Hariprasad Nellitheertha
Post by Dave Hansen
block_size_bytes: The size of each memory section (in hex)
This value is per memoryXXXX directory, right?
No, it's global. However, we have discussed doing it per-section in the
future to collapse some of the contiguous areas into a single directory.

-- Dave

Hariprasad Nellitheertha
2005-03-28 13:05:20 UTC
Post by Dave Hansen
Post by Hariprasad Nellitheertha
Post by Dave Hansen
I think there's likely a lot of commonality with the needs of memory
hotplug systems here. We effectively dump out the physical layout of
the system, but in sysfs. We do this mostly because any memory hotplug
changes generate hotplug events, just like all other hardware. If you
do this in /proc, it's another thing that memory hotplug will have to
update.
We put it in /proc primarily because what we wanted was
similar in many ways to /proc/iomem and so we (re)use a bit
of the code.
The code reuse is nice, but the expanded use of /proc is not.
Post by Hariprasad Nellitheertha
Also, we were wondering if it is appropriate to
put multiple values in a single file in sysfs.
Why would you need to do that?
Because we are putting the starting address, end address and
the memory type against each entry (just like in
/proc/iomem). Of course, we can figure out the ending
address knowing the starting address and the section size.
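
A small sketch of that derivation, assuming a global block_size_bytes file
and a per-section phys_index attribute, both in hex (the attribute names here
are assumptions about the sysfs layout, not the final interface):

#include <stdio.h>

/* read a single hex value from a sysfs attribute */
static unsigned long long read_hex(const char *path)
{
        FILE *f = fopen(path, "r");
        unsigned long long val = 0;

        if (f) {
                if (fscanf(f, "%llx", &val) != 1)
                        val = 0;
                fclose(f);
        }
        return val;
}

int main(void)
{
        unsigned long long block =
                read_hex("/sys/devices/system/memory/block_size_bytes");
        unsigned long long index =
                read_hex("/sys/devices/system/memory/memory0/phys_index");
        unsigned long long start = index * block;       /* section start */
        unsigned long long end = start + block - 1;     /* derived end   */

        printf("memory0: %#llx-%#llx\n", start, end);
        return 0;
}
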
Post by Dave Hansen
Post by Hariprasad Nellitheertha
Post by Dave Hansen
I've attached a document I started writing a couple days ago
about the sysfs layout and the call paths for hotplug. It's horribly
incomplete, but not a bad start.
If you want to see some more details of the layout, please check out
http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/patch-2.6.12-rc1-mhp1.gz
This does not have the sysfs related code. Is there a
separate patch for adding the sysfs entries?
http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/broken-out/L0-sysfs-memory-class.patch
In addition to this, I also needed to pull in the
J-zone_resize_sem.patch to get it to compile.

Would it be possible to make this a separate patch-set so
that it does not depend on memory hotplug?
Post by Dave Hansen
Post by Hariprasad Nellitheertha
Post by Dave Hansen
block_size_bytes: The size of each memory section (in hex)
This value is per memoryXXXX directory, right?
No, it's global. However, we have discussed doing it per-section in the
future to collapse some of the contiguous areas into a single directory.
I tested this on my PIII 256M machine.
/sys/devices/system/memory showed 4 memory sections each of
size 64MB. There are a couple of issues that we noticed. We
will not be able to spot those physical memory areas which
the OS does not use (such as the region between 640k and
1MB). Also, when I booted the system with the mem=100M
option, two entries (memory0 and memory1) turned up. With
block_size_bytes being 64M, this looks equivalent to
a system with 128M of memory.

If block_size_bytes was per-directory, it would be easier in
such situations.

Regards, Hari
Dave Hansen
2005-03-28 16:21:59 UTC
Post by Hariprasad Nellitheertha
Post by Dave Hansen
The code reuse is nice, but the expanded use of /proc is not.
Post by Hariprasad Nellitheertha
Also, we were wondering if it is appropriate to
put multiple values in a single file in sysfs.
Why would you need to do that?
Because we are putting the starting address, end address and
the memory type against each entry (just like in
/proc/iomem). Of course, we can figure out the ending
address knowing the starting address and the section size.
That sounds like what you *want* and not what you need :)
Post by Hariprasad Nellitheertha
Post by Dave Hansen
http://www.sr71.net/patches/2.6.12/2.6.12-rc1-mhp1/broken-out/L0-sysfs-memory-class.patch
In addition to this, I also needed to pull in the
J-zone_resize_sem.patch to get it to compile.
Would it be possible to make this a separate patch-set so
that it does not depend on memory hotplug?
Yes, it's quite possible. However, I've already done this for the
page-migration patches, and I'm not looking forward to doing it again.
If it was as simple as you describe, is there a real reason to break it
out?
Post by Hariprasad Nellitheertha
I tested this on my PIII 256M machine.
/sys/devices/system/memory showed 4 memory sections each of
size 64MB. There are a couple of issues that we noticed. We
will not be able to spot those physical memory areas which
the OS does not use (such as the region between 640k and
1MB). Also, when I booted the system with the mem=100M
option, two entries (memory0 and memory1) turned up. With
block_size_bytes being 64M, this looks equivalent to
a system with 128M of memory.
This turns out to be a minor issue for memory hotplug systems as well,
because it means that you can't add back that last 28MB of memory,
either.
Post by Hariprasad Nellitheertha
If block_size_bytes was per-directory, it would be easier in
such situations.
First of all, I think there are lots of solutions to the problem, not
just changing the scope of "block_size_bytes". We could also present a
value inside of each file that represents which pages in that memory
section are actually online and real RAM. That could be generated
(slowly) from hardware information like the e820 table. It could be
very slow because the only users would be swsusp and kexec, which aren't
performance-critical.
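
A hedged sketch of that idea: intersect one memory section with an
e820-style table of RAM ranges to see how much of the section is backed by
real RAM (the table contents below are made up for illustration).

#include <stdio.h>

struct range {
        unsigned long long start, end;          /* [start, end) */
};

/* illustrative stand-in for the firmware e820 map, not real data */
static const struct range e820_ram[] = {
        { 0x00000000ULL, 0x0009fc00ULL },       /* below 640k  */
        { 0x00100000ULL, 0x10000000ULL },       /* 1MB - 256MB */
};

int main(void)
{
        unsigned long long sec_start = 0;               /* section 0    */
        unsigned long long sec_end = 64ULL << 20;       /* 64MB section */
        unsigned long long ram = 0;
        unsigned int i;

        for (i = 0; i < sizeof(e820_ram) / sizeof(e820_ram[0]); i++) {
                unsigned long long s = e820_ram[i].start > sec_start ?
                                       e820_ram[i].start : sec_start;
                unsigned long long e = e820_ram[i].end < sec_end ?
                                       e820_ram[i].end : sec_end;

                if (s < e)
                        ram += e - s;   /* overlap with this RAM range */
        }
        printf("section 0: %llu of %llu bytes are RAM\n",
               ram, sec_end - sec_start);
        return 0;
}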

Also, having variably-sized sysfs objects presents some serious
obstacles for hotplug memory. A memory remove could involve splitting
existing memory areas, and lots of small additions could involve merging
memory areas, just like VMAs.

We haven't implemented either of these things yet, because it hasn't
been necessary, and we don't want to bloat the code. However, if
there's another user, it's a reason to go do it now. Also, it may be a
good idea to move block_size_bytes into the memoryXX directory now, just
in case we need to change it later.

-- Dave
