Real-time OS drivers and their scheduling

superticker:
This is a continuation of another thread:  "What's better: modern built-in motherboard sound chip or old sound card?"

I know NT isn't real-time, but 50ms for an IRQ to be handled sounds ludicrous. And AFAIK, data processing isn't done directly in the IRQ handler, instead some state information is saved and passed down as an IRP, and the IRQ handler itself finishes quickly. Iirc Linux does somewhat the same by having "high" and "low" parts of their IRQ handlers.-f0dder (November 10, 2006, 04:51 PM)
--- End quote ---
In a real-time driver, there are high, middle, and low parts.

The highest (hardware interrupt) part strictly services the hardware, and schedules the software interrupts and their service priorities (usually 0-63).  It may schedule several software interrupts since different parts (I/O initialization, I/O continued service, I/O completion) may require different priorities.  It will also grab and cache any data.  It's usually about 15 instructions or less.  Of course, it's reentrant code.
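
To make the division of labor concrete, here's a minimal sketch of a first-tier handler in C.  The device FIFO, ring buffer, and swi_schedule() call are all invented stand-ins (simulated so the sketch actually runs); a real handler would touch volatile memory-mapped registers and use the RTOS's own scheduling primitive:

--- Code: ---
#include <stdint.h>
#include <stdio.h>

/* Simulated device registers. A real first-tier handler would read
 * volatile memory-mapped hardware addresses; these stand-ins keep
 * the sketch self-contained and runnable. */
static uint32_t dev_status = 0x1;            /* bit 0 = receive data ready */
static uint32_t dev_fifo[3] = { 11, 22, 33 };
static int      dev_fifo_pos;

static uint32_t dev_read_data(void)          /* pop one word off the FIFO  */
{
    uint32_t v = dev_fifo[dev_fifo_pos++];
    if (dev_fifo_pos == 3)
        dev_status = 0;                      /* FIFO empty: clear status   */
    return v;
}

/* Ring buffer the middle tier drains later; only the ISR writes rx_head. */
static uint32_t rx_ring[64];
static unsigned rx_head;

static void swi_schedule(int swi, int prio)  /* stand-in for the RTOS call */
{
    printf("schedule SWI %d at priority %d\n", swi, prio);
}

/* First tier: strictly service the hardware, cache the data, and
 * schedule the middle-tier software interrupt -- nothing more. */
static void dev_hw_isr(void)
{
    while (dev_status & 0x1)
        rx_ring[rx_head++ % 64] = dev_read_data();
    swi_schedule(7, 12);
}

int main(void) { dev_hw_isr(); return 0; }
--- End code ---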

The middle-tier routines service the data.  This must also be reentrant code, so 95% of the system calls can't be made from this level.  Obviously, no C library calls can be made from this level either, since the C library isn't reentrant.  If necessary, this level will schedule a completion routine to be executed at the next level.

For the lowest tier (completion routines), the OS does save all processor registers automatically so there's high context-switch overhead entering this tier.  The good news is that your code does not have to be reentrant, so all the system calls are available to you as well as the C library.

It's interesting to note the typical maximum service rates of each tier: highest, 3000 interrupts/sec; middle, 300 interrupts/sec; and lowest, 30 interrupts/sec.  Note that the maximum service rate of the lowest tier is the same in a real-time OS as in a conventional OS.  That's because both have the same context-switch overhead at this level, since both are saving/restoring all the registers.

For (real-time) rate monotonic scheduling, we want each completion routine to have its own unique priority so there's deterministic (ordered) execution.  That's why real-time OSes (RTOSes) have so many priorities.
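
For illustration, here's a small sketch of the rate-monotonic assignment rule (shorter period = higher priority, each priority unique).  The task names and periods are made up, chosen to match the 3000/300/30 per-second rates above:

--- Code: ---
#include <stdio.h>
#include <stdlib.h>

/* Rate-monotonic assignment: the shorter a task's period, the higher
 * (numerically lower, 0 = highest) its unique priority. */
struct task { const char *name; unsigned period_us; int priority; };

static int by_period(const void *a, const void *b)
{
    const struct task *x = a, *y = b;
    return (x->period_us > y->period_us) - (x->period_us < y->period_us);
}

int main(void)
{
    struct task tasks[] = {
        { "audio",   333   },   /* ~3000/sec */
        { "control", 3333  },   /* ~300/sec  */
        { "logging", 33333 },   /* ~30/sec   */
    };
    size_t n = sizeof tasks / sizeof tasks[0];

    qsort(tasks, n, sizeof tasks[0], by_period);
    for (size_t i = 0; i < n; i++) {
        tasks[i].priority = (int)i;           /* unique, deterministic */
        printf("%-8s period %6u us -> priority %d\n",
               tasks[i].name, tasks[i].period_us, tasks[i].priority);
    }
    return 0;
}
--- End code ---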

Windows is sluggish at handling interrupts.  I've had problems with National Instruments multifunction I/O cards giving me 50 ms service rates, and National says there's nothing they can do about it.  I admit these laboratory machines have a lot of I/O going on in them, though.  That's why National offers a 486 processor running the Phar Lap OS (an RTOS) for real-time control needs on Windows.  Edit: I just realized this was a driver-service problem with several Windows 95 machines.  The "native" Windows 2000 driver model should perform much better.

Hadn't heard about real-time NT, are you sure you're not thinking of NT embedded?-f0dder (November 10, 2006, 04:51 PM)
--- End quote ---
We are definitely talking about the same product.  In 2000, it was called Real-time Windows NT, but now Microsoft is calling it Windows Embedded.  I just visited their website http://msdn.microsoft.com/embedded/windowsxpembedded/default.aspx
It's a scalable version of Windows such that you can scale its memory footprint, which is important.  I think it's still over 500K though when really scaled down, but my information is old on this spec (1997).

Just because something is embedded doesn't mean it has to be hard real-time.-f0dder (November 10, 2006, 04:51 PM)
--- End quote ---
I agree.  It is possible to do hard real-time in software, but I honestly believe hard real-time tasks are better done in hardware today because design tools for FPGAs are so easy to use now.  In addition, some SoC chips (Altera's Excalibur) incorporate both a processor and an FPGA on the same chip, so doing both firmware and a gate-array design does not increase chip count.

Iirc there's also just one scheduler in the whole of NT, used for both usermode and kernelmode stuff - although there's a distinction between usermode and kernelmode threads. The "scheduler" also isn't a separate modular part, it's interwoven in most of the NT kernel because of its particular design.-f0dder (November 10, 2006, 04:51 PM)
--- End quote ---
If that's true, then that's really bad design.  Please tell me that's not true.  In the application layer, you have two things to deal with that you don't have in the driver layer.  One is protection switches (with the Memory Management Unit, MMU), and the other is semaphore testing and processing--which is really messy and carries big overhead--in a scheduler.  Some would also include resource awareness (what resources are tied up by waiting processes), but I'm counting that case under semaphore management here.

In contrast, the driver scheduler has none of this overhead.  That makes it really lean and mean, which is something we want in all OSes.  The typical OS implementation (and I think Linux works this way) is to let the high-overhead application-layer scheduler run as a driver-level task at the lowest priority, 63.  All other driver-level tasks run at priorities 0-62, so that when they complete, the high-overhead scheduler runs.
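
A toy dispatch loop showing the arrangement (priorities and function names invented for illustration): the heavyweight application scheduler sits at priority 63 and only runs when no driver-level task at 0-62 is ready:

--- Code: ---
#include <stdbool.h>
#include <stdio.h>

#define NPRIO 64
static bool ready[NPRIO];           /* one driver-level task per priority */

static void run_app_scheduler(void) { puts("63: application-layer scheduler"); }
static void run_driver_task(int p)  { printf("%2d: driver-level task\n", p); }

/* Dispatch loop: driver tasks at 0-62 always run before the heavyweight
 * application scheduler parked at priority 63. */
static void dispatch_once(void)
{
    for (int p = 0; p < NPRIO; p++) {
        if (!ready[p]) continue;
        ready[p] = false;
        if (p == NPRIO - 1) run_app_scheduler();
        else                run_driver_task(p);
        return;
    }
}

int main(void)
{
    ready[12] = ready[63] = true;
    dispatch_once();   /* runs the priority-12 driver task first        */
    dispatch_once();   /* only now does the application scheduler run   */
    return 0;
}
--- End code ---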

As for priority levels, there's 32 of them, with one being REALTIME. While that priority isn't strictly "realtime" by computer science terms,...-f0dder (November 10, 2006, 04:51 PM)
--- End quote ---
I follow what you're saying, but I wouldn't look at it that way.  All first-tier (hardware interrupt) driver tasks must complete first.  Afterwards, all second-tier driver tasks must complete, and there are no special priorities for these.  After that, priorities 0-31 for the main scheduler get attention, where priority 0 is the real-time completion routine (which I "think" is swappable like anything else in the application layer, but maybe there's an exception here).  The point is that Windows places its completion routines in protected mode, which means more context-switch overhead (with the MMU), but they would be easier to write and debug than if they were in the driver layer.

Unlike Windows, most OSes require you to reload the entire OS if you enlarge the driver for any reason.  This makes developing in the driver layer inconvenient.  Although placing the completion routine in the application layer means more context-switch overhead (MMU register switches for protected mode), it is handier for development.

Most RTOS application designs don't even have MMU hardware, so doing completion routines in the third tier of the driver layer makes sense since the application layer isn't protected anyway.

f0dder:
Well, purge 9x from your mind - it's ancient and outdated, and is basically a DOS extender on steroids, with large parts of it carried directly from 16bit win3.x, large mutex-protected non-preemptible sections etc. (And yes, 32bit apps on 9x still end up using large 16bit portions; try tracing any GDI call with a kernelmode debugger if you don't believe me :) ).

The 9x VxD driver model is pretty different from the NT driver model, whereas the "WDM" driver model is mostly some PnP and power-saving additions to the original NT model, as far as I can tell.

NT kernel design isn't too far off from your description of a RTOS design - a fundamental difference, though, is that NT doesn't make any hard guarantees. But there are IRQLs (which determine which interrupts are allowed to be serviced, as well as which kernel calls are safe), completion routines, etc.
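
A toy model of the IRQL gating idea (not NT's actual implementation; the level numbers are only loosely modeled on NT's PASSIVE and DISPATCH levels): an interrupt is serviced only if its level is above the level the processor is currently running at:

--- Code: ---
#include <stdio.h>

/* Current interrupt request level; 0 = PASSIVE_LEVEL in NT terms. */
static int current_irql = 0;

static void deliver(int irql, const char *what)
{
    if (irql > current_irql)
        printf("servicing %s (IRQL %d > current %d)\n", what, irql, current_irql);
    else
        printf("deferring %s (IRQL %d <= current %d)\n", what, irql, current_irql);
}

int main(void)
{
    deliver(2, "DPC");            /* serviced: we're at passive level  */
    current_irql = 2;             /* raised, e.g. while running a DPC  */
    deliver(2, "another DPC");    /* deferred until IRQL drops again   */
    deliver(8, "device IRQ");     /* higher-level interrupt still wins */
    return 0;
}
--- End code ---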

I've never seen any mention of "real time NT", and afaik even for NT4 it was called "NT embedded" - but I could be wrong. Other people have experimented with realtime NT, though, and there's at least one company that has realtime addons for NT.

Now as for scheduling... the NT scheduler schedules threads. Not "processes" or "drivers", but threads - although obviously it does deal with some process data, since thread priority is a mix of base process priority and thread priority, the process owning the current foreground window gets a slight boost, etc.

Generally the scheduler will not schedule a lower-priority thread if there are higher-priority threads on the ready queue, but do consult "Inside Windows"/"Windows Internals" - it's a pretty good read.
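
The core selection rule can be sketched in a few lines of C (a toy version only; the real dispatcher also deals with per-CPU ready queues, affinity, priority boosts, and so on).  Priority 24 appears here because that's the base priority of the REALTIME class:

--- Code: ---
#include <stdio.h>

#define LEVELS 32
static int ready_count[LEVELS];   /* threads queued at each priority */

/* Pick the highest-priority level with a ready thread; -1 if idle. */
static int pick_next_level(void)
{
    for (int p = LEVELS - 1; p >= 0; p--)
        if (ready_count[p] > 0)
            return p;
    return -1;
}

int main(void)
{
    ready_count[8]  = 3;          /* normal-priority threads          */
    ready_count[24] = 1;          /* a REALTIME-class thread          */
    printf("next thread comes from priority %d\n", pick_next_level());
    return 0;
}
--- End code ---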

Lots of stuff is pageable in the NT kernel by default, which IMHO is a bad idea - with the amount of RAM machines have had for years now, keeping base kernel + drivers locked is a good idea performance-wise. Even back when I had less RAM, "DisablePagingExecutive=1" meant that my system "recovered" faster after quitting some intensive game.

There are kernel-mode threads as well as user-mode threads, for example the "lazy dirty-page writer" and "zero-page provider" are kernel-mode threads.

Iirc an interrupt happens in the context of whatever thread is active - which means that often a r3->r0->r3 transition is needed, while a full register/state preservation and CR3 reload isn't. This also means that at that IRQL, you can make very few assumptions about which operations are safe to perform.

What I meant about the NT scheduler being "all over the kernel" is based on things like threads blocking on a resource. Instead of having a centralized scheduler that continuously checks events to see if they've been triggered and then wakes up threads, basically each operation that can cause a thread to unblock sets off the notification itself.
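
A sketch of that event-driven style, with invented names: the wait list lives on the synchronization object itself, and the operation that signals the object moves its waiters to the ready queue directly, so no central loop ever polls:

--- Code: ---
#include <stdio.h>

#define MAXWAIT 8
struct event {
    int         signaled;
    const char *waiters[MAXWAIT];  /* names of blocked threads */
    int         nwaiters;
};

static void make_ready(const char *thread)
{
    printf("%s moved to ready queue\n", thread);
}

/* The unblock logic lives in the signaling operation itself: no central
 * scheduler loop polls events; the setter wakes the waiters directly. */
static void event_set(struct event *e)
{
    e->signaled = 1;
    for (int i = 0; i < e->nwaiters; i++)
        make_ready(e->waiters[i]);
    e->nwaiters = 0;
}

int main(void)
{
    struct event io_done = { 0, { "worker-1", "worker-2" }, 2 };
    event_set(&io_done);   /* I/O completion wakes both waiters here */
    return 0;
}
--- End code ---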

superticker:
Well, purge 9x from your mind - it's ancient and outdated, and is basically a DOS extender on steroids,...-f0dder (November 11, 2006, 08:07 AM)
--- End quote ---

Yes, I agree.  I remember the lab supervisors griping about all the sporadic buffer overrun errors on the data acquisition boards when they pushed their Windows 95 machines.  The fundamental problem is that the driver service rates were too non-deterministic for their data streams.  I gave them two choices: (1) Buy new acquisition hardware with larger FIFO buffers ($1600 each) that would better withstand the bursty Windows service rates, or (2) replace the Win95 OS with Windows NT.  I was pushing solution (1) because I wasn't too sure solution (2) would work in the long run for them.

It doesn't pay to push a system up to its design limits; otherwise, reliability suffers.  They eventually understood that.

... there's at least one company that have real-time addons for NT.-f0dder (November 11, 2006, 08:07 AM)
--- End quote ---
There are two companies.  One approach actually tries to preserve the Win32 API but supplements it with additional scheduling calls.  If I'm remembering right, both approaches modify the driver model for those hardware devices that require real-time service.

Now as for scheduling... the NT scheduler schedules threads. Not "processes" or "drivers", but threads - although obviously it does deal with some process data, since thread priority is a mix of base process priority and thread priority, the process owning the current foreground window gets a slight boost, etc.-f0dder (November 11, 2006, 08:07 AM)
--- End quote ---

Your terminology may be different from mine.  With my nomenclature, scheduling a process means doing a protection switch (registers in the Memory Management Unit, MMU, are changed; a process may even get mapped out of memory).  In contrast, scheduling a thread requires no protection switch, so time slicing threads has much lower context-switch overhead.

Servicing a software interrupt requires no protection switch either, because all driver code remains mapped in addressable memory and is not protected from other drivers.  In other words, all drivers are mapped together with a single pair of mapping registers.  Now, Windows NT can dynamically load drivers, which means a newly loaded driver won't be next to the others.  That's going to require an extra dedicated pair of mapping registers.  But most OSes can't do this and won't support hot-swapping of hardware.  Windows and VxWorks are the only exceptions.  (VxWorks is commonly used in communications equipment because of its hot-swapping feature.)

But honestly, I don't know how the Windows NT scheduler works.  It may work differently from other OSes when it comes to threads and drivers.  Perhaps threads are protected from each other, but if so, then why call them threads, and why have protection-switch overhead if the "threaded" code doesn't need it?

Individual Windows processes are separately mapped from each other, as expected.  So a protection switch is expected when time slicing between them.

There are kernel-mode threads as well as user-mode threads, for example the "lazy dirty-page writer" and "zero-page provider" are kernel-mode threads.-f0dder (November 11, 2006, 08:07 AM)
--- End quote ---

Everything in kernel mode would use the same pair of MMU mapping registers, so no protection switch is needed.  As a result, it does make sense to use threads for all kernel task switching to reduce context-switch overhead.

Iirc an interrupt happens in the context of whatever active thread - which means that often a r3->r0->r3 context switch is needed, while a full register/state preservation and CR3 reloading isn't. This also means that at that IRQL, you can make very few assumptions on which operations are safe to perform.-f0dder (November 11, 2006, 08:07 AM)
--- End quote ---
So you must write reentrant code for both levels of the Windows driver?  Does this mean that the completion routine (that's running as a real-time task in the application layer) is the only place non-reentrant code is allowed?

f0dder:
In NT, a process is basically a container which holds one or more threads, various objects the process uses (files, sockets, pipes, mutexes, ...), the process memory map and some other stuff.

A thread is a register set, as well as a stack, a structured exception chain, and a few other things. (Hm, I guess I should look up whether the thread register set includes CR3 (the page table pointer), or if that's taken from the process block.)

Each process has a process info block, and each thread has a thread info block.

So the scheduler only schedules threads, since that's the only schedulable entity in the system. It does take various factors into consideration, though, and the scheduler isn't only driven by the timer interrupt. If scheduling a thread within the same process, it's about as simple as updating registers, but when the thread lives in another process, the page tables etc. have to be updated as well. It's a bit more complicated than this, though, especially because of multi-CPU systems.
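
A toy model of that distinction (not NT code; the page_table field stands in for a CR3 value): the switch only reloads the address space when the next thread belongs to a different process:

--- Code: ---
#include <stdio.h>

struct process { int page_table; };                 /* stand-in for CR3       */
struct thread  { const char *name; struct process *proc; };

static int loaded_page_table = -1;                  /* "CR3" currently loaded */

/* Switch threads; only touch the page tables when the next thread
 * belongs to a different process. */
static void switch_to(struct thread *next)
{
    if (next->proc->page_table != loaded_page_table) {
        loaded_page_table = next->proc->page_table;
        printf("%s: reloading page tables (cross-process switch)\n", next->name);
    } else {
        printf("%s: registers only (same process)\n", next->name);
    }
}

int main(void)
{
    struct process a = { 100 }, b = { 200 };
    struct thread  t1 = { "t1", &a }, t2 = { "t2", &a }, t3 = { "t3", &b };
    switch_to(&t1);   /* cross-process: load A's tables  */
    switch_to(&t2);   /* same process: cheap switch      */
    switch_to(&t3);   /* cross-process: load B's tables  */
    return 0;
}
--- End code ---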

Servicing an IRQ will always go to kernelmode, and so will subsequent processing, so you'll always have ring-transition overhead (unless the IRQ happens in the context of a thread that was already in kernelmode - NT is pretty much fully preemptible). But this doesn't involve updating the page table, since no *process* switch needs to be done.

Usermode processes(!) are isolated against each other, and the kernel is isolated from usermode processes as well. You can't say that *threads* are isolated against each other, since isolation happens on a process boundary.

Btw, Linux supports dynamic load/unload of drivers (or "kernel modules" as they call them) as well, and I'm pretty sure that other OSes do too - I'd be very surprised if anything with a microkernel design doesn't, actually.

superticker:
In NT, a process is basically a container which holds one or more threads, various objects the process uses (files, sockets, pipes, mutexes, ...), the process memory map and some other stuff....

Each process has a process info block, and each thread has a thread info block.  So the scheduler only schedules threads, since that's the only schedulable entity in the system.-f0dder (November 11, 2006, 11:40 AM)
--- End quote ---
I follow your point of view now, and it does make sense.  My only concern with this approach is that some processes (with fewer threads) may get starved for execution more than others (with many threads), since the scheduler fails to take the process membership of the threads into account when time slicing.  I'm wondering if that's a good thing or a bad thing?  Usually starvation is considered bad, but what about a scheduling bias favoring processes with more threads ready to run?  This is a harder question.  :-\   ... And I need time to think about this.  Perhaps as long as a process isn't totally starved, it's okay to give it less consideration if it has only one thread ready to run.
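
A back-of-the-envelope illustration of the bias: under a plain round-robin across ready threads (a simplification of what NT actually does), a process's CPU share is just its thread count over the total:

--- Code: ---
#include <stdio.h>

/* With round-robin over ready threads, a process's CPU share is its
 * thread count divided by the total ready-thread count. */
int main(void)
{
    int threads[] = { 9, 1 };    /* process A: 9 threads, process B: 1 */
    int total = threads[0] + threads[1];

    for (int i = 0; i < 2; i++)
        printf("process %c: %d/%d = %.0f%% of the CPU\n",
               'A' + i, threads[i], total, 100.0 * threads[i] / total);
    return 0;
}
--- End code ---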

One of the requirements of an RTOS is the ability to substitute your own scheduler.  Some calls even let you pick the scheduler on their parameter list.  As a result, the scheduler needs to be autonomous from the rest of the OS.  Since the Windows scheduler is so intertwined with OS operations, I doubt you could replace it.

User-mode processes(!) are isolated against each other, and the kernel is isolated from user-mode processes as well.-f0dder (November 11, 2006, 11:40 AM)
--- End quote ---
In most OSes, everything in "kernel mode" (which includes the drivers and the kernel/monitor) is mapped together such that execution can move from one place to another without the overhead of a protection switch.  (Yes, that means a bad driver can crash the kernel.)  Does Windows work the same way?

I know Microsoft signs drivers today, which means they shouldn't crash certain system parts.

BTW, Linux supports dynamic load/unload of drivers (or "kernel modules" as they call them) as well, and I'm pretty sure that other OSes do as well - I'd be very surprised if anything with a microkernel design doesn't, actually.
-f0dder (November 11, 2006, 11:40 AM)
--- End quote ---
Well, you run out of MMU register pairs for each separately protected module that must be constantly mapped into memory.  How many MMU mapping registers does the Pentium processor have?

Now daemon services, even if they are swapped in, don't necessarily always need to be mapped in if you're running out of mapping registers.  In other words, daemon services can share mapping registers.  But critical modules of the microkernel always need to be mapped in to avoid the overhead of protection switches.  So the question is, are there enough dedicated mapping registers for all the microkernel modules and dynamically loaded drivers?  And you need to have a few "sharable" mapping registers left over for the application layer and its service daemons.

I haven't ported a protected-mode OS before, so I've never had to write any MMU management code.  In the embedded-systems world, we try to avoid protected mode if it's a small closed system (such as the engine controller in your car).
